Abstract



This is the second of three planned papers describing ZAP, a satisfiability engine that
substantially generalizes existing tools while retaining the performance characteristics of
modern high-performance solvers. The fundamental idea underlying ZAP is that many
problems passed to such engines contain rich internal structure that is obscured by the
Boolean representation used; our goal is to define a representation in which this structure
is apparent and can easily be exploited to improve computational performance. This paper
presents the theoretical basis for the ideas underlying ZAP, arguing that existing ideas in
this area exploit a single, recurring structure in that multiple database axioms can be
obtained by operating on a single axiom using a subgroup of the group of permutations on
the literals in the problem. We argue that the group structure precisely captures the general
structure at which earlier approaches hinted, and give numerous examples of its use. We go
on to extend the Davis-Putnam-Logemann-Loveland inference procedure to this broader
setting, and show that earlier computational improvements are either subsumed or left
intact by the new method. The third paper in this series discusses ZAP's implementation
and presents experimental performance results.

1. Introduction 

This is the second of a planned series of three papers describing ZAP, a satisfiability engine
that substantially generalizes existing tools while retaining the performance characteristics
of existing high-performance solvers such as zChaff (Moskewicz, Madigan, Zhao, Zhang,
& Malik, 2001).^1 In the first paper (Dixon, Ginsberg, & Parkes, 2004b), to which we
will refer as ZAP1, we discussed a variety of existing computational improvements to the
Davis-Putnam-Logemann-Loveland (DPLL) inference procedure, eventually producing the
following table. The rows and columns are described below.

1. The first paper has appeared in JAIR; the third is currently available as a technical report (Dixon,
Ginsberg, Hofer, Luks, & Parkes, 2004a) but has not yet been peer reviewed.

©2004 AI Access Foundation. All rights reserved.





               | efficiency  | proof  | resolution         | propagation      | learning
               | of rep'n    | length |                    | technique        | method
---------------+-------------+--------+--------------------+------------------+----------------
SAT            |             | EEE    |                    | watched literals | relevance
cardinality    | exponential | P?E    | not unique         | watched literals | relevance
pseudo-Boolean | exponential | P?E    | unique             | watched literals | + strengthening
symmetry       |             | EEE*   | not believed in P  | same as SAT      | same as SAT
QPROP          | exponential | ???    | in P using reasons | exp improvement  | + first-order

The rows of the table correspond to observations regarding existing representations used
in satisfiability research, as reflected in the labels in the first column:^2

1. SAT refers to conventional Boolean satisfiability work, representing information as
conjunctions of disjunctions of literals (CNF).

2. cardinality refers to the use of "counting" clauses; if we think of a conventional
disjunction of literals $\vee_i l_i$ as

\[ \sum_i l_i \geq 1 \]

then a cardinality clause is one of the form

\[ \sum_i l_i \geq k \]

for a positive integer $k$.

3. pseudo-Boolean clauses extend cardinality clauses by allowing the literals in
question to be weighted:

\[ \sum_i w_i l_i \geq k \]

Each $w_i$ is a positive integer giving the weight to be assigned to the associated literal.

4. symmetry involves the introduction of techniques that are designed to explicitly
exploit local or global symmetries in the problem being solved.

5. QPROP deals with universally quantified formulae where all of the quantifications
are over finite domains of known size.
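
The relationship among the first three representations can be made concrete with a small sketch. The Python below is purely illustrative and is not taken from any of the systems discussed; the literal encoding (a variable name paired with a polarity) and all function names are assumptions introduced here. It shows that a Boolean clause is the special case $k = 1$ of a cardinality clause, which is in turn a pseudo-Boolean clause with all weights equal to one.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: ("a", True) is the literal a, ("a", False) is "not a".
Literal = Tuple[str, bool]


def literal_value(lit: Literal, assignment: Dict[str, bool]) -> int:
    """Return 1 if the literal is true under a total assignment, else 0."""
    var, positive = lit
    return int(assignment[var] == positive)


def pseudo_boolean_holds(weighted: List[Tuple[int, Literal]], k: int,
                         assignment: Dict[str, bool]) -> bool:
    """Check the pseudo-Boolean clause sum_i w_i * l_i >= k."""
    return sum(w * literal_value(l, assignment) for w, l in weighted) >= k


def cardinality_holds(lits: List[Literal], k: int,
                      assignment: Dict[str, bool]) -> bool:
    """A cardinality clause is a pseudo-Boolean clause with unit weights."""
    return pseudo_boolean_holds([(1, l) for l in lits], k, assignment)


def clause_holds(lits: List[Literal], assignment: Dict[str, bool]) -> bool:
    """An ordinary disjunction is the cardinality clause with k = 1."""
    return cardinality_holds(lits, 1, assignment)
```

For instance, under the assignment a = true, b = false, c = true, the disjunction of a, b and c holds, "at least two of a, b, c" holds, and "at least three" does not.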

The columns in the table measure the performance of the various systems against a 
variety of metrics: 

2. Please see the preceding paper ZAP1 (Dixon et al., 2004b) for a fuller explanation and for a relatively
comprehensive list of references where the earlier work is discussed.

1. Efficiency of representation measures the extent to which a single axiom in a
proposed framework can replace many in CNF. For cardinality, pseudo-Boolean and 
quantified languages, it is possible that exponential savings are achieved. We argued 
that such savings were possible but relatively unlikely for cardinality and pseudo- 
Boolean encodings but were relatively likely for QPROP. 

2. Proof length gives the minimum proof length for the representation on three classes 
of problems: the pigeonhole problem, parity problems due to Tseitin (1970) and 
clique coloring problems (Pudlak, 1997). An E indicates exponential proof length; 
P indicates polynomial length. While symmetry-exploitation techniques can provide 
polynomial-length proofs in certain instances, the method is so brittle against changes 
in the axiomatization that we do not regard this as a polynomial approach in general. 

3. Resolution indicates the extent to which resolution can be lifted to a broader
setting. This is straightforward in the pseudo-Boolean case; cardinality clauses have the
problem that the most natural resolvent of two cardinality clauses may not be a
cardinality clause, and there may be many cardinality clauses that could be derived as
a result. Systems that exploit local symmetries must search for such symmetries at
each inference step, a problem that is not believed to be in P. Provided that reasons
are maintained, inference remains well defined for quantified axioms, requiring only
the introduction of a linear complexity unification step.

4. Propagation technique describes the techniques used to draw conclusions from
an existing partial assignment of values to variables. For all of the systems except
QPROP, Zhang and Stickel's watched literals idea (Moskewicz et al., 2001; Zhang &
Stickel, 2000) is the most efficient mechanism known. This approach cannot be lifted
to QPROP, but a somewhat simpler method can be lifted and average-case exponential
savings obtained as a result (Ginsberg & Parkes, 2000).

5. Learning method gives the technique typically used to save conclusions as the
inference proceeds. In general, relevance-bounded learning (Bayardo & Miranker, 1996;
Bayardo & Schrag, 1997; Ginsberg, 1993) is the most effective technique known here.
It can be augmented with strengthening (Guignard & Spielberg, 1981; Savelsbergh,
1994) in the pseudo-Boolean case and with first-order reasoning if quantified formulae
are present.

Our goal in the current paper is to add a single line to the above table:

               | efficiency  | proof  | resolution         | propagation       | learning
               | of rep'n    | length |                    | technique         | method
---------------+-------------+--------+--------------------+-------------------+----------------
SAT            |             | EEE    |                    | watched literals  | relevance
cardinality    | exponential | P?E    | not unique         | watched literals  | relevance
pseudo-Boolean | exponential | P?E    | unique             | watched literals  | + strengthening
symmetry       |             | EEE*   | not believed in P  | same as SAT       | same as SAT
QPROP          | exponential | ???    | in P using reasons | exp improvement   | + first-order
ZAP            | exponential | PPP    | in P using reasons | watched literals, | + first-order
               |             |        |                    | exp improvement   | + parity
               |             |        |                    |                   | + others



ZAP is the approach to inference that is the focus of this series of papers. The basic
idea in ZAP is that in realistic problems, many Boolean clauses are "versions" of a single 
clause. We will make this notion precise shortly; at this point, one might think of all of the 
instances of a quantified clause as being versions of any particular ground instance. The 
versions, it will turn out, correspond to permutations on the set of literals in the problem. 

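As an example, suppose that we are trying to prove that it is impossible to put n + 1
pigeons into n holes if each pigeon is to get its own hole. A Boolean axiomatization of this
problem will include the axioms

\[
\begin{array}{cccc}
\neg p_{11} \vee \neg p_{21} & \neg p_{12} \vee \neg p_{22} & \cdots & \neg p_{1n} \vee \neg p_{2n} \\
\neg p_{11} \vee \neg p_{31} & \neg p_{12} \vee \neg p_{32} & \cdots & \neg p_{1n} \vee \neg p_{3n} \\
\vdots & \vdots & & \vdots \\
\neg p_{11} \vee \neg p_{n+1,1} & \neg p_{12} \vee \neg p_{n+1,2} & \cdots & \neg p_{1n} \vee \neg p_{n+1,n} \\
\neg p_{21} \vee \neg p_{31} & \neg p_{22} \vee \neg p_{32} & \cdots & \neg p_{2n} \vee \neg p_{3n} \\
\vdots & \vdots & & \vdots \\
\neg p_{21} \vee \neg p_{n+1,1} & \neg p_{22} \vee \neg p_{n+1,2} & \cdots & \neg p_{2n} \vee \neg p_{n+1,n} \\
\vdots & \vdots & & \vdots \\
\neg p_{n1} \vee \neg p_{n+1,1} & \neg p_{n2} \vee \neg p_{n+1,2} & \cdots & \neg p_{nn} \vee \neg p_{n+1,n}
\end{array}
\]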



where pij says that pigeon i is in hole j. Thus the first clause above says that pigeon one 
and pigeon two cannot both be in hole one. The second clause in the first column says that 
pigeon one and pigeon three cannot both be in hole one. The second column refers to hole 
two, and so on. It is fairly clear that all of these axioms can be reconstructed from the first 
by interchanging the pigeons and the holes, and it is this intuition that ZAP attempts to 
capture. 
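
The claim that every axiom above is a "version" of the first can be checked by brute force for small n. The sketch below is only an illustration of the intuition, not ZAP's representation: it enumerates every permutation of the pigeons and of the holes explicitly (which is exponential and exactly what a group-based representation is meant to avoid), and the function names and the encoding of a clause as a set of (pigeon, hole) pairs standing for negated literals are assumptions made here.

```python
from itertools import permutations


def pigeonhole_exclusion_clauses(n):
    """All axioms (not p_ij or not p_kj) for pigeons i < k and holes j.

    Each clause is a frozenset of (pigeon, hole) pairs; every pair stands
    for the negated literal "pigeon is not in hole".
    """
    return {frozenset({(i, j), (k, j)})
            for j in range(1, n + 1)
            for i in range(1, n + 2)
            for k in range(i + 1, n + 2)}


def images_of_first_clause(n):
    """Apply every pigeon permutation and hole permutation to the first axiom."""
    first = [(1, 1), (2, 1)]  # not p_11 or not p_21
    pigeons = list(range(1, n + 2))
    holes = list(range(1, n + 1))
    images = set()
    for pigeon_perm in permutations(pigeons):
        pmap = dict(zip(pigeons, pigeon_perm))
        for hole_perm in permutations(holes):
            hmap = dict(zip(holes, hole_perm))
            images.add(frozenset((pmap[i], hmap[j]) for i, j in first))
    return images
```

For n = 3 both functions produce the same set of clauses, one for each choice of a hole and an unordered pair of pigeons.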

What makes this approach interesting is the fact that instead of reasoning with a large 
set of clauses, it becomes possible to reason with a single clause instance and a set of 
permutations. As we will see, the sets of permutations that occur naturally are highly 
structured sets called groups, and exploiting this structure can lead to significant efficiency 
gains in both representation and reasoning. 

Some further comments on the above table:

• Unlike cardinality and pseudo-Boolean methods, which seem unlikely to achieve
exponential reductions in problem size in practice, and QPROP, which seems likely to
achieve such reductions, ZAP is guaranteed, when the requisite structure is present, to
replace a set of n axioms with a single axiom of size at most $v \log(n)$, where v is the
number of variables in the problem (Proposition 4.8).

• The fundamental inference step in ZAP is in NP with respect to the ZAP representation,
and therefore has complexity no worse than exponential in the representation size
(i.e., polynomial in the number of Boolean axioms being resolved). In practice, the
average case complexity appears to be low-order polynomial in the size of the ZAP
representation (i.e., polynomial in the logarithm of the number of Boolean axioms
being resolved) (Dixon et al., 2004a).

• ZAP obtains the savings attributable to subsearch in the QPROP case while casting
them in a general setting that is equivalent to watched literals in the Boolean case.
This particular observation is dependent on a variety of results from computational
group theory and is discussed in the third paper in this series (Dixon et al., 2004a).

• In addition to learning the Boolean consequences of resolution, ZAP continues to sup- 
port relevance-bounded learning schemes while also allowing the derivation of first- 
order consequences, conclusions based on parity arguments, and combinations thereof. 

In order to deliver on these claims, we begin in Section 2 by summarizing both the DPLL 
algorithm and the modifications that embody recent progress, casting DPLL into the precise 
form that is needed in ZAP and that seems to best capture the architecture of modern 
systems such as zChaff. Section 3 is also a summary of ideas that are not new with us, 
providing an introduction to some ideas from group theory. 

In Section 4, we describe the key insight underlying ZAP. As mentioned above, the 
structure exploited in earlier examples corresponds to the existence of particular groups of 
permutations of the literals in the problem. We call the combination of a clause and such 
a permutation group an augmented clause, and the efficiency of representation column 
of our table corresponds to the observation that augmented clauses can use group structure 
to improve the efficiency of their encoding. 

Section 5 (resolution) describes resolution in this broader setting, and Section 6 (proof 
length) presents a variety of examples of these ideas at work, showing that the pigeonhole 
problem, clique-coloring problems, and Tseitin's parity examples all admit short proofs 
in the new framework. Section 7 (learning method) recasts the DPLL algorithm in the 
new terms and discusses the continued applicability of relevance in our setting. Conclud- 
ing remarks are contained in Section 8. Implementation details, including a discussion 
of propagation techniques, are deferred until the third of this series of papers (Dixon 
et al., 2004a). This third paper will also include a presentation of performance details; at 
this point, we note merely that ZAP does indeed exhibit polynomial performance on the 
natural encodings of pigeonhole, parity and clique-coloring problems. This is in sharp
contrast with other methods, where theoretical best-case performance (let alone experimental
average-case performance) is known to be exponential on these problem classes.

2. Boolean Satisfiability Engines

In ZAP1, we presented descriptions of the standard Davis-Putnam-Logemann-Loveland
(DPLL) Boolean satisfiability algorithm and described informally the extensions to DPLL
that deal with learning. Our goal in this paper and the next is to describe an
implementation of our theoretical ideas. We therefore begin here by being more precise about DPLL and
its extension to relevance-bounded learning, or RBL. We present some general definitions
that we will need throughout this paper, and then give a description of the DPLL algorithm
in a learning/reason-maintenance setting. We prove that an implementation of these ideas
can retain the soundness and completeness of DPLL while using an amount of memory that
grows polynomially with problem size. Although this result has been generally accepted
since 1-relevance learning ("dynamic backtracking," Ginsberg, 1993) was generalized by
Bayardo, Miranker and Schrag (Bayardo & Miranker, 1996; Bayardo & Schrag, 1997), we
know of no previous proof that RBL has the stated properties.

Definition 2.1 A partial assignment is an ordered sequence

\[ \langle l_1, \ldots, l_n \rangle \]

of distinct and consistent literals.

Definition 2.2 Let $\vee_i l_i$ be a clause, which we will denote by c, and suppose that P is a
partial assignment. We will say that the possible value of c under P is given by

\[ \mathrm{poss}(c, P) = |\{ l \in c \mid \neg l \notin P \}| - 1 \]

If no ambiguity is possible, we will write simply poss(c) instead of poss(c, P). In other
words, poss(c) is the number of literals that are either already satisfied or not valued by P,
reduced by one (since true clauses require at least one true literal).

If S is a set of clauses, we will write $\mathrm{poss}_n(S, P)$ for the subset of $c \in S$ for which
$\mathrm{poss}(c, P) \leq n$.

In a similar way, we will define curr(c, P) to be

\[ \mathrm{curr}(c, P) = |\{ l \in c \cap P \}| - 1 \]

We will write $\mathrm{curr}_n(S, P)$ for the subset of $c \in S$ for which $\mathrm{curr}(c, P) \leq n$.

Informally, if poss(c, P) = 0, that means that any partial assignment extending P can make
at most one literal in c true; there is no room for any "extra" literals to be true. This might
be because all of the literals in c are assigned values by P and only one such literal is true;
it might be because there is a single unvalued literal and all of the other literals are false.
If poss(c, P) < 0, it means that the given partial assignment cannot be extended in a way
that will cause c to be satisfied. Thus we see that $\mathrm{poss}_{-1}(S, P)$ is the set of clauses in S that
are falsified by P. Since curr gives the "current" excess in the number of satisfied literals
(as opposed to poss, which gives the possible excess), the set $\mathrm{poss}_0(S, P) \cap \mathrm{curr}_{-1}(S, P)$
is the set of clauses that are not currently satisfied and have at most one unvalued literal.
These are generally referred to as unit clauses.

We note in passing that Definition 2.2 can easily be extended to deal with pseudo- 
Boolean instead of Boolean clauses, although that extension will not be our focus here. 
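
To make Definition 2.2 concrete, the following sketch computes poss and curr for Boolean clauses. It is an illustration only: the encoding of literals as nonzero integers (with -3 standing for the negation of variable 3, in the style of the DIMACS format), the representation of a partial assignment as a list of such integers, and all function names are assumptions introduced here rather than part of ZAP.

```python
def poss(clause, P):
    """Number of literals in clause not falsified by P, minus one."""
    assigned = set(P)
    return sum(1 for l in clause if -l not in assigned) - 1


def curr(clause, P):
    """Number of literals in clause already satisfied by P, minus one."""
    assigned = set(P)
    return sum(1 for l in clause if l in assigned) - 1


def poss_n(S, P, n):
    """The clauses c in S with poss(c, P) <= n."""
    return [c for c in S if poss(c, P) <= n]


def curr_n(S, P, n):
    """The clauses c in S with curr(c, P) <= n."""
    return [c for c in S if curr(c, P) <= n]


def unit_clauses(S, P):
    """poss_0 intersected with curr_{-1}: unsatisfied, at most one unvalued literal."""
    return [c for c in S if poss(c, P) <= 0 and curr(c, P) <= -1]
```

With clauses (1, 2, 3), (-1, 2) and (-1, -2) and the partial assignment [1], the last two clauses are unit; extending the assignment to [1, -2] satisfies (-1, -2) and falsifies (-1, 2), which is then the only member of poss_n(S, P, -1).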






ZAP 2: Theory 



Definition 2.3 An annotated partial assignment is an ordered sequence

\[ \langle (l_1, c_1), \ldots, (l_n, c_n) \rangle \]

of distinct and consistent literals and clauses, where $c_i$ is the reason for literal $l_i$ and either
$c_i$ = true (indicating that $l_i$ was a branch point) or $c_i$ is a clause such that:

1. $l_i$ is a literal in $c_i$, and

2. $\mathrm{poss}(c_i, \langle l_1, \ldots, l_{i-1} \rangle) = 0$

An annotated partial assignment will be called sound with respect to a set of clauses C if
$C \models c_i$ for each reason $c_i$.

The reasons have the property that after the literals $l_1, \ldots, l_{i-1}$ are all included in the
partial assignment, it is possible to conclude $l_i$ directly from $c_i$, since there is no other way
for $c_i$ to be satisfied.

Definition 2.4 Let $c_1$ and $c_2$ be clauses, each a set of literals to be disjoined. We will say
that $c_1$ and $c_2$ resolve if there is a unique literal $l$ such that $l \in c_1$ and $\neg l \in c_2$. If $c_1$ and
$c_2$ resolve, their resolvent, to be denoted resolve($c_1$, $c_2$), is the disjunction of the literals in
the set $c_1 \cup c_2 - \{l, \neg l\}$.

If $r_1$ and $r_2$ are reasons, the result of resolving $r_1$ and $r_2$ is defined to be:

\[
\mathrm{resolve}(r_1, r_2) =
\begin{cases}
r_2, & \text{if } r_1 = \text{true;} \\
r_1, & \text{if } r_2 = \text{true;} \\
\text{the conventional resolvent of } r_1 \text{ and } r_2, & \text{otherwise.}
\end{cases}
\]

Definition 2.5 Let C be a set of clauses and P a partial assignment. A nogood for P
is any clause c that is entailed by C but falsified by P. A nogood is any clause c that is
entailed by C.

We are now in a position to present one of the basic building blocks of DPLL or RBL,
the unit propagation procedure. This computes the "obvious" consequences of a partial
assignment:

Procedure 2.6 (Unit propagation) To compute Unit-Propagate(C, P) for a set C of
clauses and an annotated partial assignment P:

The above procedure returns a pair of values. The first indicates whether a contradiction has
been found. If so, the second value is the reason for the failure, a consequence of the clausal
database C that is falsified by the partial assignment P (i.e., a nogood). If no contradiction
is found, the second value is a suitably extended partial assignment. Procedure 2.6 has also
been modified to work with annotated partial assignments, and to annotate the new choices
that are made when P is extended.
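
One plausible reading of the behavior just described can be sketched in Python. This is a hedged illustration built only from the definitions in this section (poss, curr, reasons, and resolution of reasons); it is not the authors' procedure and not ZAP's implementation. Literals are assumed to be nonzero integers, clauses frozensets of literals, and an annotated partial assignment a list of (literal, reason) pairs in which the reason `True` marks a branch point. The choice of which falsified or unit clause to process first is arbitrary here.

```python
def poss(clause, lits):
    return sum(1 for l in clause if -l not in lits) - 1


def curr(clause, lits):
    return sum(1 for l in clause if l in lits) - 1


def resolve(c1, c2):
    """Conventional resolvent; the clauses must clash on exactly one literal."""
    clashes = [l for l in c1 if -l in c2]
    if len(clashes) != 1:
        raise ValueError("clauses do not resolve")
    l = clashes[0]
    return frozenset((c1 | c2) - {l, -l})


def resolve_reasons(r1, r2):
    """Resolution of reasons, where True marks a branch point."""
    if r1 is True:
        return r2
    if r2 is True:
        return r1
    return resolve(r1, r2)


def unit_propagate(clauses, P):
    """Return (True, nogood) on a contradiction, else (False, extended P)."""
    P = list(P)
    while True:
        lits = {l for l, _ in P}
        for c in clauses:
            if poss(c, lits) <= -1:  # c is falsified by P
                # Resolve with the reason of the latest literal that falsifies c.
                for l, reason in reversed(P):
                    if -l in c:
                        return True, resolve_reasons(c, reason)
                return True, c  # only the empty clause reaches this point
        unit = next((c for c in clauses
                     if poss(c, lits) == 0 and curr(c, lits) == -1), None)
        if unit is None:
            return False, P
        l = next(x for x in unit if -x not in lits)  # the one unvalued literal
        P.append((l, unit))  # annotate the new literal with its reason
```

With clauses {-1, 2}, {-2, 3} and {-2, -3} and the branch literal 1, propagation derives 2 and then 3, finds {-2, -3} falsified, and resolves it against the reason {-2, 3} for literal 3, returning the nogood {-2}.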

Proposition 2.7 Suppose that C is a Boolean satisfiability problem, and P is a sound 
annotated partial assignment. Then: 

1. If unit-propagate(P) = $\langle$false, $P'\rangle$, then $P'$ is a sound extension of P, and

2. If unit-propagate(P) = $\langle$true, $c\rangle$, then c is a nogood for P.

Proof. In the interests of maintaining the continuity of the exposition, most proofs (in- 
cluding this one!) have been deferred to the appendix. Proofs or proof sketches will appear 
in the main text only when they further the exposition in some way. □ 
We can now describe relevance-bounded learning: 

Procedure 2.8 (Relevance-bounded reasoning, RBL) Let C be a SAT problem, and D
a set of learned nogoods, so that $C \models D$. Let P be an annotated partial assignment, and k
a fixed relevance bound. To compute RBL(C, D, P):

 1  $\langle x, y \rangle \leftarrow$ Unit-Propagate($C \cup D$, P)
 2  if x = true
 3     then c $\leftarrow$ y
 4          if c is empty
 5             then return failure
 6             else remove successive elements from P so that c is satisfiable and
                    poss(c, P) $\leq$ k
 7                  D $\leftarrow \{c\} \cup \mathrm{poss}_k(D, P)$
 8                  return RBL(C, D, P)
 9     else P $\leftarrow$ y
10          if P is a solution to C
11             then return P
12             else l $\leftarrow$ a literal not assigned a value by P
13                  return RBL(C, D, $\langle P, (l, \text{true}) \rangle$)



This procedure is fairly different from the original description of DPLL, so let us go 
through it. 

In general, variables are assigned values either via branching on line 13, or unit propa- 
gation on lines 1 and 9. If unit propagation terminates without reaching a contradiction or 
finding a solution (so that x is false on line 2 and the test on line 10 fails as well), then a 
branch variable is selected and assigned a value, and the procedure recurs. 

If the unit propagation procedure "fails" and returns $\langle$true, $c\rangle$ for a new nogood c,
the new clause is learned by adding it to D, and the search backtracks at least to the
point where c is satisfiable.^3 Any learned clauses that have become irrelevant (in that
their poss value exceeds the irrelevance cutoff k) are removed. Note that we only remove
learned irrelevant nogoods; we obviously cannot remove clauses that were part of the original
problem specification. It is for this reason that the sets C and D (of original and learned
clauses respectively) are maintained separately.

Procedure 2.8 can fail only if a contradiction (an empty clause c) is derived. In all other 
cases, progress is made by augmenting the set of clauses to include at least one new nogood 
that eliminates the current partial assignment. Instead of resetting the branch literal I to 
take the opposite value as in Davis et al.'s original description of their algorithm, a new
clause is learned and added to the problem. This new clause causes either I or some previous 
variable to take a new value. 

The above description is ambiguous about a variety of points. We do not specify how 
far to backtrack on line 6, or the branch literal to be chosen on line 12. We will not be 
concerned with these choices; ZAP takes the same approach that zChaff does and the 
implementation is straightforward. 

Theorem 2.9 RBL is sound and complete in that RBL(C, $\emptyset$, $\langle\rangle$) will always return a
satisfying assignment if C is satisfiable and will always report failure if C is unsatisfiable. RBL
also uses an amount of memory polynomial in the size of C (although exponential in the
relevance bound k).

As discussed at some length in ZAP1, other authors have focused on extending the
language of Boolean satisfiability in ways that preserve the efficiency of RBL. These extensions
include the ability to deal with quantification (Ginsberg & Parkes, 2000), pseudo-Boolean or
cardinality clauses (Barth, 1995, 1996; Chandru & Hooker, 1999; Dixon & Ginsberg, 2000;
Hooker, 1988; Nemhauser & Wolsey, 1988), or parity clauses (Baumgartner & Massacci,
2000; Li, 2000).

3. Some Concepts from Group Theory 

Relevance-bounded learning is only a part of the background that we will need to describe 
ZAP; we also need some basic ideas from group theory. There are many excellent references 
available on this topic (Rotman, 1994, and others), and we can only give a brief account 
here. Our goal is not to present a terse sequence of definitions and to then hollowly claim 
that this paper is self-contained; we would rather provide insight regarding the goals and 
underlying philosophy of group theory generally. We will face a similar problem in the final 
paper in this series, where we will draw heavily on results from computational group theory 
and will, once again, present a compact and hopefully helpful overview of a broad area of 
mathematics. 

Definition 3.1 A group is a set S equipped with an associative binary operator $\circ$. The
operator $\circ$ has an identity 1, with $1 \circ x = x \circ 1 = x$ for all $x \in S$, and an inverse, so that
for every $x \in S$ there is an $x^{-1}$ such that $x \circ x^{-1} = x^{-1} \circ x = 1$.

3. As we remarked in ZAP1, some systems such as zChaff (Moskewicz et al., 2001) backtrack further to
the point where c is unit.

In other words, $\circ$ is a function $\circ : S \times S \rightarrow S$; since $\circ$ is associative, we always have

\[ (x \circ y) \circ z = x \circ (y \circ z) \]

The group operator $\circ$ is not required to be commutative; if it is commutative, the group is
called Abelian.

Typical examples of groups include $\mathbb{Z}$, the group of integers with the operation being
addition. Similarly, $\mathbb{Q}$ and $\mathbb{R}$ are the groups of rationals or reals under addition. For
multiplication, zero needs to be excluded, since it has no inverse, and we get the groups $\mathbb{Q}^*$
and $\mathbb{R}^*$. Note that $\mathbb{Z}^*$ is not a group, since 1/n is not an integer for most integers n.

Other common groups include $\mathbb{Z}_n$ for any positive integer n; this is the group of integers
mod n, where the group operation is addition mod n. For a prime p, every element of the set
$\mathbb{Z}_p^*$ of nonzero integers mod p does have a multiplicative inverse, so that $\mathbb{Z}_p^*$ is a group
under multiplication. The group $\mathbb{Z}_1$ contains the single element 0 and is the trivial group.
This group is often denoted by 1.
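
For finite examples such as these, the group axioms of Definition 3.1 can be verified exhaustively. The checker below is a small illustrative sketch; the function name `is_group` and its interface are assumptions made here. It tests closure, associativity, the existence of an identity, and the existence of inverses by brute force, which is feasible only for very small sets.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of the group axioms on a finite set."""
    elements = list(elements)
    elems = set(elements)
    # Closure: the operator must map S x S into S.
    if any(op(x, y) not in elems for x, y in product(elements, repeat=2)):
        return False
    # Associativity.
    if any(op(op(x, y), z) != op(x, op(y, z))
           for x, y, z in product(elements, repeat=3)):
        return False
    # Identity.
    identities = [e for e in elements
                  if all(op(e, x) == x == op(x, e) for x in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses.
    return all(any(op(x, y) == e == op(y, x) for y in elements)
               for x in elements)
```

The integers mod 6 under addition pass this check, as do the nonzero integers mod 7 under multiplication; the nonzero integers mod 6 under multiplication fail it, since 2 times 3 is 0 mod 6 and closure is lost when the modulus is not prime.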

All of the groups we have described thus far are Abelian, but non-Abelian groups are 
not hard to come by. As an example, the set of all n x n matrices with real entries and 
nonzero determinants is a group under multiplication, since every matrix with a nonzero 
determinant has an inverse. This group is called the general linear group and is denoted
GL(n).

Of particular interest to us will be the so-called permutation groups: 

Definition 3.2 Let T be a set. A permutation of T is a bijection $\omega : T \rightarrow T$ from T to
itself.

Proposition 3.3 Let T be a set. Then the set of permutations of T is a group under the 
composition operator. 

Proof. This is simply a matter of validating the definition. Functional composition is well 
known to be associative (although not necessarily commutative), and the identity function 
from T to itself is the identity for the composition operator. Since each permutation is a 
bijection, permutations can be inverted to give the inverse operator for the group. □ 

The group of permutations on T is called the symmetry group of T, and is typically
denoted Sym(T). We will take the view that the composition $f \circ g$ acts with f first and g
second, so that $(f \circ g)(x) = g(f(x))$ for any $x \in T$. (Note the variable order.)

Because permutation groups will be of such interest to us, it is worthwhile to introduce
some additional notation for dealing with them in the case where $T \subseteq \mathbb{Z}$ is a subset of
the integers. In the special case where $T = \{1, \ldots, n\}$, we will often abbreviate Sym(T) to
either Sym(n) or simply $S_n$.

Of course, we can get groups of permutations without including every permutation on a 
particular set; the 2-element set consisting of the identity permutation and the permutation 
that swaps two specific elements of T is closed under inversion and composition and is 
therefore a group as well. In general, we have: 

Definition 3.4 Let $(G, \circ)$ be a group. Then a subset $H \subseteq G$ is called a subgroup if $(H, \circ)$
is a group. This is denoted $H \leq G$. If the inclusion is proper, we write $H < G$.

A subgroup of a group G is any subset of G that includes the identity and that is closed
under composition and inversion.

If G is a finite group, closure under composition suffices. To understand why, suppose
that we have some subset $H \subseteq G$ that is closed under composition, and that $x \in H$. Now
$x^2 \in H$, and $x^3 \in H$, and so on. Since G is finite, we must eventually have $x^m = x^n$ for some
integers m and n, where we assume m > n. It follows that $x^{m-n} = 1$ so $x^{m-n-1} = x^{-1}$, so
H is closed under inversion and therefore a subgroup.

Proposition 3.5 Let G be a group, and suppose that $H_1 \leq G$ and $H_2 \leq G$ are subgroups.
Then $H_1 \cap H_2 \leq G$ is a subgroup of G as well. □

This should be clear, since $H_1 \cap H_2$ must be closed under inversion and composition if
each of the component groups is.

Definition 3.6 Let G be a group, and $S \subseteq G$ a subset. Then there is a unique smallest
subgroup of G containing S, which is denoted $\langle S \rangle$ and called the subgroup of G generated
by S. The order of an element $g \in G$ is defined to be $|\langle g \rangle|$.

The generated subgroup is unique because it can be formed by taking the intersection of
all subgroups containing S. This intersection is itself a subgroup by virtue of Proposition 3.5.
If $S = \emptyset$ or $S = \{1\}$, the trivial subgroup is generated, consisting of only the identity element
of G. Thus the order of the identity element is one. For any element g, the order of g is
the least positive exponent m for which $g^m = 1$.

We have already remarked that the two-element set {1, (ab)} is a group, where 1
represents the trivial permutation and (ab) is the permutation that swaps a and b. It is easy to
see that {1, (ab)} is the group generated by (ab). The order of (ab) is two.

In a similar way, if (abcd) is the permutation that maps a to b, b to c, c to d and d back
to a, then the subgroup generated by (abcd) is

\[ \{1, (abcd), (ac) \circ (bd), (adcb)\} \]

The third element simultaneously swaps a and c, and swaps b and d. The order of the
permutation (abcd) is four, and (abcd) is called a 4-cycle. Both this subgroup and the
previous subgroup generated by (ab) are Abelian, although the full permutation group
Sym(L) is not Abelian if |L| > 2. It is not hard to see that $\langle p \rangle$ is Abelian for any specific
permutation p.

Slightly more interesting is the group generated by (abcd) together with (ac). This group
has eight elements:

\[ \{1, (abcd), (ac) \circ (bd), (adcb), (ac), (ad) \circ (bc), (bd), (ab) \circ (cd)\} \tag{1} \]

The first four permutations don't "use" (ac) and the second four do. Since $(abcd) \circ (ac) \neq
(ac) \circ (abcd)$, this group is not Abelian.
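
The eight elements listed in (1) can be reproduced mechanically. In the sketch below, which is an illustration rather than anything drawn from ZAP, a permutation of the points a, b, c, d is stored as a four-character string whose i-th character is the image of the i-th point; this encoding and the helper names `from_cycles`, `compose` and `generate` are assumptions introduced here. Composition follows the convention used in the text, acting with the first argument first.

```python
POINTS = "abcd"


def from_cycles(*cycles):
    """Build a permutation from disjoint cycles such as "abcd" or "ac", "bd"."""
    image = {p: p for p in POINTS}
    for cyc in cycles:
        for i, p in enumerate(cyc):
            image[p] = cyc[(i + 1) % len(cyc)]
    return "".join(image[p] for p in POINTS)


def compose(f, g):
    """The product f o g, acting with f first and g second."""
    return "".join(g[POINTS.index(f[i])] for i in range(len(POINTS)))


def generate(generators):
    """Subgroup generated by the given permutations.

    Closing the identity under right multiplication by the generators is
    enough because the group is finite, so inverses arise as powers.
    """
    identity = POINTS
    group = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for gen in generators:
            y = compose(x, gen)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group
```

Calling `generate([from_cycles("abcd"), from_cycles("ac")])` yields eight permutations, and composing the two generators in the two possible orders gives (ab)(cd) and (ad)(bc) respectively, confirming that the group is not Abelian.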

It is not hard to see that the group (1) is in fact the group of rigid motions of a square
whose vertices are labeled a, b, c and d. The first permutation (abcd) corresponds to a
rotation of the square by 90° and the second (ac), to a flip around the b-d diagonal. The
first four permutations in (1) simply rotate the square, while the second four use the flip as
well; the group is not Abelian because a flip followed by a 90° clockwise rotation is different
from the rotation followed by the flip. In a similar way, the six basic twists of Rubik's
cube generate a permutation group of size approximately $4.3 \times 10^{19}$, giving all accessible
permutations of the faces.

In general, suppose that f is a permutation on a set S and $x \in S$. We can obviously
consider the image of x under f. Rather than denote this image by f(x) as usual, it is
customary to denote it by $x^f$. The reason for this "inline" notation is that we now have

\[ x^{fg} = (x^f)^g \]

which seems natural, as opposed to the unnatural

\[ (fg)(x) = g(f(x)) \]

as mentioned previously. We have dropped the explicit composition operator $\circ$ here.

Continuing, we can form the set of images $x^f$ where f varies over all elements of some
permutation group G. This is called the orbit of x under G:

Definition 3.7 Let $G \leq \mathrm{Sym}(T)$ be a permutation group. Then for any $x \in T$, the orbit
of x in G, to be denoted by $x^G$, is given by $x^G = \{x^g \mid g \in G\}$.

Returning to the case of permutations on integers, suppose that n is an integer and $\omega$
a permutation. We can consider $\langle\omega\rangle$, the group of permutations generated by $\omega$, which is
the set of powers of $\omega$ until we eventually have $\omega^m = 1$ for m the order of $\omega$. The orbit of
n under $\langle\omega\rangle$ is the set $n^{\langle\omega\rangle} = \{n, n^{\omega}, n^{\omega^2}, \ldots, n^{\omega^{m-1}}\}$.

Now suppose that n' is some other integer that appears in this sequence, say $n' = n^{\omega^k}$.
Now $(n')^{\omega} = (n^{\omega^k})^{\omega} = n^{\omega^{k+1}}$, so that the images of n' can be "read off" from the sequence
of images of n. It therefore makes sense to write this "piece" of the permutation as (for
example)

\[ (1, 3, 4) \tag{2} \]

indicating that 1 is mapped to 3, that 3 is mapped to 4, and that 4 is mapped back to 1.

Of course, the 3-cycle (2) doesn't tell us what happens to integers that are not in $n^{\langle\omega\rangle}$;
for them, we need another cycle as in (2). So if the permutation $\omega$ swaps 2 and 5 in addition
to mapping 1 to 3 and so on, we might write

\[ \omega = (1, 3, 4)(2, 5) \tag{3} \]

If 6 is not moved by $\omega$ (so that $6^{\omega} = 6$), we could write

\[ \omega = (1, 3, 4)(2, 5)(6) \tag{4} \]

In general, we will not mention variables that are fixed by the permutation, preferring (3) to
the longer (4). We can often omit the commas within the cycles, so that we will continue to
abbreviate (a, b, c) as simply (abc). If we need to indicate explicitly that two cycles are part
of a single permutation, we will introduce an extra set of parentheses, perhaps rewriting (3)
as

\[ \omega = ((1, 3, 4)(2, 5)) \]

Every permutation can be written as a product of disjoint cycles in this way.

Finally, in composing permutations written in this fashion, we will either drop the $\circ$ or
replace it with $\cdot$, so that we have, for example,

\[ (abc) \cdot (abd) = (ad)(bc) \]

The point a is moved to b by the first cycle and then to d by the second. The point b is 
moved to c by the first cycle and then not changed by the second; c is moved to a and then 
on to b. Finally, d is not moved by the first cycle but is moved to a by the second. 
Two other notions that we will need are that of closure and stabilizer: 

Definition 3.8 Let G < Sym(r), and S CT. By the G-closure of S, to be denoted , 
we will mean the set 

= {s^\s eS andgeC} 



Definition 3.9 Given a group G < Sym(T) and L Q T, the pointwise stabilizer of L, 
denoted Gl, is the subgroup of all g ^ G such that W = I for every I G L. The set stabilizer 
of L, denoted is that subgroup of all g ^ G such that = L. 

As an example, consider the group G generated by the permutation a; = (1,3, 4) (2, 5) 
that we considered above. Since oj^ = (1,4, 3) and t<j^ = (2, 5), it is not too hard to see that 
G = ((1, 4, 3), (2, 5)) is the group generated by the 3-cycle (1,4,3) and the 2-cycle (2,5). 
The subgroup of G that point stabilizes the set {2} is thus G2 = ((1,4,3)), and 6^2,5 is 
identical. The subgroup of G that set stabilizes {2, 5} is ^{2,5} = G, however, since every 
permutation in G leaves the set {2, 5} intact. 

4. Axiom Structure as a Group 

While we will need the details of Procedures 2.8 and 2.6 in order to implement our ideas, 
the procedures themselves inherit certain weaknesses of DPLL as originally described. Two 
weaknesses that we hope to address are: 

1. The appearance of posso(C, P) fl curr_i(C, P) in the inner unit propagation loop 
requires an examination of a significant subset of the clausal database at each inference 
step, and 

2. Both DPLL and RBL are fundamentally resolution-based methods; there are known 
problem classes that are exponentially difficult for resolution-based methods but which 
are easy if the language in use is extended to include either cardinality or parity 
clauses. 

4.1 Examples of Structure 

Let us begin by examining examples where specialized techniques can help to address these 
difficulties. 
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4.1.1 SUBSEARCH 

As we have discussed elsewhere (Dixon et al., 2004b; Ginsberg k, Parkes, 2000), the set of 

axioms that need to be investigated in the DPLL inner loop often has structure that can be 
exploited to speed the examination process. If a ground axiomatization is replaced with a 
lifted one, the search for axioms with specific syntactic properties is NP-complete in the 
number of variables in the lifted axiom, and is called subsearch for that reason. 

In many cases, search techniques can be applied to the subsearch problem. As an 
example, suppose that we are looking for instances of the lifted axiom 



that are unit, so that poss(i,P) = and curr(i,P) = — 1 for some such instance i and a 
unit propagation is possible clS Bj result. 

Our notation here is that of QPROP. There is an implicit universal quantification over 
X, y and z, and each quantification is over a domain of finite size. We assume that all of 
the domains are of size d, so (5) corresponds to ground axioms. If a(x, y) is true for all x 
and y (which we can surely conclude in time 0{(P)), then we can conclude without further 
work that (5) has no unit instances, since every instance of (5) is already satisfied. If a{x, y) 
is true except for a single (x, y) pair, then we need only examine the d possible values of z 
for unit instances, reducing our total work from d^ to + d. 

It will be useful in what follows to make this example still more specific, so let us assume 
that y and z arc all chosen from a two clement domain {A, B}. The single lifted axiom (5) 
now corresponds to the set of ground instances: 



If we introduce ground literals l\, l2,h, h for the instances of a{x, y) and so on, we get: 



a{x,y)\/ b{y,z) \/c{x,z) 



(5) 



a{A, A) V b{A, A) 
a{A, A) y b{A, B) 
a{A,B)Vb{B,A) 
a{A,B)\/ b{B,B) 
a{B,A)\/b{A,A) 
a{B,A)\/b{A,B) 
a{B,B)\/b{B,A) 
a{B,B)Vb{B,B) 



V c{A,A) 
\/c{A,B) 

V c{A,A) 
Vc{A,B) 
Vc{B,A) 
\/c{B,B) 
yciB,A) 
yc{B,B) 



/l V /s V Ig 

/i V /e V ho 

hyhy h 

/2 V /g V ho 

k^ky hi 

k V ^6 V h2 

/4 V /r V hi 

^4 V /s V /l2 



(6) 
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at which point the structure imphcit in (5) has been obscured. We will return to the details 
of this example shortly. 

4.1.2 Cardinality 

Structure is also present in the sets of axioms used to encode the pigeonhole problem, which 
is known to be exponentially difficult for any resolution-based method (HaJcen, 1985). As 

shown by a variety of authors (Cook, Coullard, & Turan, 1987; Dixon & Ginsberg, 2000), 
the pigeonhole problem can be solved in polynomial time if we extend our representation 
to include cardinality axioms such as 

Xi-\ \-Xm> k (7) 

As shown in ZAPl, the single axiom (7) is equivalent to (^'"j^) conventional disjunctions. 

As in Section 4.1.1, we will consider this example in detail. Suppose that we have the 
clause 

Xi + X2 + X3 + X4 + X5 > 3 (8) 

saying that at least 3 of the Xj's are true. This is equivalent to 

xi V a;2 V X3 xi V 0:4 V X5 

xi V a;2 V X4 X2 V ^3 V X4 

xi V X2 V X5 X2 V X3 V X5 (9) 

xi V X3 V X4 X2 V X4 V X5 

xi V X3 V X5 X3 V X4 V X5 



4.1.3 Parity Clauses 

Finally, we consider clauses that are most naturally expressed using modular arithmetic or 
exclusive or's, such as 

xi-\ h Xfe = (mod 2) (10) 

or 

xi^ hXfc = l(mod2) (11) 

The parity of the sum of the .Xj's is specified as even in (10) or as odd in (11). 

It is well known that axiom sets consisting of parity clauses in isolation can be solved 
in polynomial time using Gaussian elimination, but there are examples that are exponen- 
tially difficult for resolution-based methods (Tseitin, 1970). As in the other examples we 
have discussed, single axioms such as (11) reveal structure that a straightforward Boolean 
axiomatization obscures. In this case, the single axiom (11) with k = 3 is equivalent to: 

Xi V X2 V X3 

Xi V ^X2 V ^X3 (12) 
-1X1 V X2 V -1X3 

-1X1 V -1X2 V X3 

As the cardinality axiom (7) is equivalent to [jj^i) disjunctions, a parity axiom of the form 
of (10) or (11) is in general equivalent to 2^~^ Boolean disjunctions. 
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4.2 Formalizing Structure 

Of course, the ground axiomatizations (6), (9) and (12) are equivalent to the original descrip- 
tions given by (5), (7) and (11), so that any structure present in these original descriptions 
is still there. That structure has, however, been obscured by the ground encodings. Our 
goal in this section is to begin the process of understanding the structure in a way that lets 
us describe it in general terms. 

As a start, note that each of the axiom sets consists of axioms of equal length; it follows 
that the axioms can all be obtained from a single one simply by permuting the literals in 
the theory. In (6) and (9), literals are permuted with other literals of the same sign; in (12), 
literals are permuted with their negated versions. But in every instance, a permutation 
suffices. 

Thus, for example, the set of permutations needed to generate (9) from the first ground 
axiom alone is clearly just the set 

J7 = Sym({xi,X2,X3,2:4,X5}) (13) 

since these literals can be permuted arbitrarily to move from one element of (9) to another. 
The set Q in (13) is a subgroup of the full permutation group S2n on 2n literals in n 
variables, since O, is easily seen to be closed under inversion and composition. 

What about the example (12) involving a parity clause? Here the set of permutations 
needed to generate the four axioms from the first is given by: 

(xi,^a;i)(.T2,-'X2) (14) 
{xi,^Xi){x3,^X3) (15) 
(X2,^X2)(X3,^X3) (16) 

Literals are now being exchanged with their negations, but this set, too, is closed under the 
group inverse and composition operations. Since each element is a composition of disjoint 
transpositions, each element is its own inverse. The composition of the first two elements 
is the third. 

The remaining example (6) is a bit more subtle; perhaps this is to be expected, since the 
axiomatization (6) obscures the underlying structure far more effectively than does either 
(9) or (12). 

To understand this example, note that the set of axioms (6) is "generated" by a set 
of transformations on the underlying variables. In one transformation, we swap the values 
of A and B for x while leaving the values for y and z unchanged, corresponding to the 
permutation 

(a(A A),a{B, A)){aiA, B),a{B, B)){c{A, A),c{B, A)){c{A, B),c{B, B)) 

We have included in a single permutation the induced changes to all of the relevant ground 
literals. (The relation h doesn't appear because h does not have x as an argument in (5).) 
In terms of the literals in (6), this becomes 

^^x = {hk)il2U){l9hi){lioli2) 
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In a similar way, swapping the two values for y corresponds to the permutation 

f^y = {hh){hU){hh){kk) 

and z produces 

= ikk){hk){kho){hih2) 

Now consider Q = {ujx,^^y,i^z), the subgroup of Sym({/i, . . . ,/i2}) that is generated by 
LOx, ojy and ujz- Since the clauses in (6) can be obtained from any single clause by permuting 
the values of y and it is clear that the image of any single clause in the set (6) under 
$7 is exactly the complete set of clauses (6). 

As an example, operating on the first axiom in (6) with u)x produces 

^3 V /s V /ii 

This is the fifth axiom, as it should be, since we have swapped a{A,A) with a{B,A) and 
c{A,A) with c{B,A). 

Alternatively, a straightforward calculation shows that 

UxUJy = {hU){l2k){hh){kl8){l9lu){hoh2) 

and maps the first axiom in (9) to the next-to-last, the second axiom to last, and so on. 

It should be clear at this point what all of these examples have in common. In every 
case, the set of ground instances corresponding to a single non-Boolean axiom can be 
generated from any single ground instance by the elements of a subgroup of the group 
S2n of permutations of the literals in the problem. 

Provided that all of the clauses are the same length, there is obviously some subsei (as 
opposed to subgroup) of S2n that can produce all of the clauses from a single one. But 
subgroups are highly structured objects; there arc many fewer subgroups of S2n than there 
are subsets.^ One would not expect, a priori, that the particular sets of permutations arising 
in our examples would all have the structure of subgroups. The fact that they do, that all 
of these particular subsets are subgroups even though so few subsets are in general, is what 
leads to our general belief that the structure of the subgroups captures and generalizes the 
general idea of structure underlying our motivating examples. 

In problems without structure, the subgroup property is absent. An instance of random 
3-SAT, for example, can always be encoded using a single 3-literal clause c and then that set 
of permutations needed to recover the entire problem from c in isolation. There is no struc- 
ture to the set of pernuitations because the original set of clauses was itself unstructured. 
In the examples we have been considering, on the other hand, the structure is implicit in 
the requirement that the set J7 used to produce the clauses be a group. As we will see, this 
group structure also has just the computational properties needed if we are to lift RBL and 
other Boolean satisfiability techniques to our broader setting. 

Let us also point out the surprising fact that the subgroup idea captures all of the 
structures discussed in ZAPl. It is not surprising that the various structures used to reduce 
proof size all have a similar flavor, or that the structure used to speed unit propagation be 

4. In general, S2n has 2^^"^' subsets, of which only approximately 2" are subgroups (Pyber, 1993). 
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uniform. But it strikes us as remarkable that these two types of structure, used for such 
different purposes, are in fact instances of a single framework. 

This, then, is the technical insight on which zap rests: Instead of generalizing the 
language of Boolean satisfiability as seems required by the range of examples we have 
considered, it suffices to annotate ground clauses with the CI needed to reproduce a larger 
axiom set. Before we formalize this, however, note that any "reasonable" permutation that 
maps a literal h to another literal I2 should respect the semantics of the axiomatization and 
map ^Zi to -1/2 as well. 

Definition 4.1 Given a set of n variables, we will denote by Wn that subgroup of S2n 
consisting of permutations that map the literal to -^h whenever they map li to h- 

Informally, an element of Wn corresponds to a permutation of the n variables, together with 
a choice to flip some subset of them; Wn is therefore of size \Wn\ = 2"n!.^ 
We are now in a position to state: 

Definition 4.2 An augmented clause in an n-variable Boolean satisfiability problem is a 
pair (c, G) where c is a Boolean clause and G < Wn ■ A ground clause d is an instance of 
an augmented clause (c, G) if there is some g & G such that d = . The clause c will he 
called the base instance of (c, G) . 

Our aim in the remainder of this paper is to show that augmented clauses have the 
properties needed to justify the claims we made in the introduction: 

1. They can be represented compactly, 

2. They can be combined efficiently using a generalization of resolution, 

3. They generalize existing concepts such as quantification over finite do- 
mains, cardinality, and parity clauses, together with providing natural 
generalizations for proof techniques involving such clauses, 

4. Rbl can be extended with little or no computational overhead to manipu- 
late augmented clauses instead of ground ones, and 

5. Propagation can be computed efficiently in this generalized setting. 

The first four points will be discussed in this and the next three sections of the paper. The 
final point is presented in the next paper in this series. 

4.3 Eflftciency of Representation 

For the first point, the fact that the augmentations G can be represented compactly is a 
consequence of G's group structure. In the example surrounding the reconstruction of (9) 
from (13), for example, the group in question is the full symmetry group on m elements, 
where m is the number of variables in the cardinality clause. In the lifting example (12), 

5. We note in passing that Wn is the so-called wreath product of 52 and 5n, typically denoted S^lSn- The 
specific group Wn is also called the group of "permutations and complementations" by Harrison (1989). 
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we can describe the group in terms of the generators and uJz instead of hsting all 

eight elements that the group contains. In general, we have (recall that proofs appear in 
the appendix): 

Proposition 4.3 Let S be a set of ground clauses, and (c, G) an equivalent augmented 
clause, where G is represented by generators. It is possible in polynomial time to find a set 
of generators {coi, . . . ,cok} where k < log2 \G\ and G = {loi, . . . ,LOk). 

Since the size of the full permutation group Sn is only n! < n" and a single generator 
takes at most 0(n) space, we have: 

Corollary 4.4 Any augmented clause in a theory containing n literals can he expressed in 
0{n^ log2 n) space. □ 

This result can be strengthened using: 

Proposition 4.5 (Jerrum, 1986; Knuth, 1991) Let G < Sn- It is possible to find in 
polynomial time a set of generators for G of size at most 0{n). □ 

This reduces the 0(n^ log2 n) in the corollary to simply 0(n^).^ 

Before proceeding, let us make a remark regarding computational complexity. All of 
the group-theoretic constructs of interest to us can be computed in time polynomial in 
the group size; basically one simply enumerates the group and evaluates the construction 
(generate and test, as it were). What is interesting is the collection of group constructions 
that can be computed in time polynomial in the number of generators of the group and the 
number of variables in the problem. Given Proposition 4.5, the time is thus polynomial in 
the number of variables in the problem. 

Note that the size of the group G can be vastly greater than the number of instances of 
any particular augmented clause (c, G) . As an example, for the cardinality clause 

xi H \-Xm>k (17) 

the associated symmetry group Symjzi, . . . , Xm} acts on an instance such as 

Xi V • • • V Xm-k+l (18) 

to reproduce the full Boolean axiomatization. But each such instance corresponds to 

(m — + 1)! distinct group elements as the variables within the clause (18) are permuted. 

In this particular case, the symmetry group Symjrci, . . . ,Xm} can in fact be generated 
by the two permutations (xi, X2) and (x2, X3, . . . , Xm). 

Definition 4.6 Two augmented clauses (ci,Gi) and (c2,G2) will be called equivalent if 

they have identical sets of instances. This will he denoted (ci,Gi) = (c2,G2)- 

6. Although the methods used are nonconstructive, Babai (1986) showed that the length of an increasing 
sequence of subgroups of 5„ is at most [^J — 2; this imposes the same bound on the number of 

generators needed (compare the proof of Proposition 4.3). Using other methods, McGiver and Neumann 
stated (1987) that for n 7^ 3, there is always a generating set of size at most [/^\. 
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Proposition 4.7 Let {c,G) be an augmented clause. Then if c' is any instance of{c,G), 
(c,G) = (c',G). 

We also have: 

Proposition 4.8 Let (c, G) be an augmented clause with d distinct instances. Then there 
is a subgroup H <G that can be described using 0(log2(cZ)) generators such that {c,H) = 
(c, G) . Furthermore, given d and generators for G, there is a Monte Carlo polynomial-time 
algorithm for constructing the generators of such an H.^ 

Proposition 4.5 is the first of the results promised in the introduction: If d Boolean 
axioms involving n variables can be captured as instances of an augmented clause, that 
augmented clause can be represented using 0(n) generators; Proposition 4.8 guarantees 
that 0(log2 d) generators suffice as well. 

In the specific instances that we have discussed, the representational efficiencies are 
greater still: 



clause 


Boolean 




total 


type 


axioms 


generators 


size 


cardinality 


\k-l) 


2 


m + 1 


parity 




3 


k + 5 


QPROP 




2v 


v{d + l) 



Each row gives the number of Boolean axioms or generators needed to represent a clause 
of the given type, along with the total size of those generators. For the cardinality clause 
(17), the complete symmetry group over m variables can be expressed using exactly two 
generators, one of size 2 and the other of size m — 1.^ The number of Boolean axioms is 
(j^.^^) as explained in Section 4.1.2. 

For the parity clause 

xi + ■ ■ ■ + Xk = m (mod 2) 

the number of Boolean axioms is the same as the number of ways to select an even number 
of the Xj's, which is half of all of the subsets of {xi, . . . ,Xk}. (Remove xi from the set; 
now any subset of the remaining Xi can be made of even parity by including xi or not 
as appropriate.) The parity groups can be captured by A; — 1 generators of the form 
(xi, -'Xi), {xi, ^Xi) as i = 2, . . . ,k (total size 4(A; — 1)); alternatively, one can combine the 
single generator (xi, -'Xi)(x2, -'X2) with the full symmetry group on xi, . . . , x^ to describe 
a parity clause using exactly three generators (total size 4 + 2 + A; — 1). 

Finally, a QPROP clause involving v variables, each with a domain of size d, corresponds 
to a set of d'" individual domain axioms. As we saw in Section 4.2 and will formalize in 
Section 6.1, the associated group can be described using symmetry groups over the domains 
of each quantified variable; there are v such groups and two generators (of size 2 and d—1) 
are required for each.^ 

7. A Monte Carlo algorithm is one that is not deterministic but that can be made to work with arbitrarily 
high specified probability without changing its overall complexity (Seress, 2003). 

8. As noted earlier, Sn is generated by the transposition (1, 2) and the n — 1-cycle (2, 3, ... , n). 

9. Depending on the sizes, the number of generators needed for a product of symmetry groups can be 
reduced in many cases, although the total size is unchanged. 
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Note that the total sizes are virtually optimal in all of these cases. For cardinality and 

parity clauses, it is surely essential to enumerate the variables in question (size m and k 
respectively). For qprop clauses, simply enumerating the domains of quantification takes 
space vd. 

5. Resolution 

We now turn to the question of basing derivations on augmented clauses instead of ground 
ones. We begin with a few preliminaries: 

Proposition 5.1 For ground clauses c\ and C2 and a permutation to G Wn, 

resolve(ti;(ci), u;(c2)) = ti;(resolve(ci, C2)) 



Definition 5.2 If C is a set of augmented clauses, we will say that C entails an augmented 
clause (c, G), writing C \= (c, G), if every instance of (c, G) is entailed by the set of instances 
of the augmented clauses in C . 

We are now in a position to consider lifting the idea of resolution to our setting, but let 
us first discuss the overall intent of this lifting. What we would like to do is to think of an 
augmented clause as having force similar to all of its instances; as a result, when we resolve 
two augmented clauses (ci,Gi) and (c2,G2), we would like to obtain as the (augmented) 
resolvent the set of all resolutions that are sanctioned by resolving an instance of (ci , G\ ) 
with one of (02,^2)- Unfortunately, we have: 

Proposition 5.3 There are augmented clauses c\ and ci such that the set S of resolvents 
of instances of the two clauses does not correspond to any single augmented clause (c, G) . 

Given that we cannot capture exactly the set of possible resolvents of two augmented 
clauses, what can we do? If {c,G) is an "augmented resolvent" of (ci,Gi) and (c2,G2)) we 
might expect (c, G) to have the following properties: 

1. It should be sound, in that (ci,Gi) A (£2,^2) |= (c, G). This says that every instance 
of (c, G) is indeed sanctioned by resolving an instance of (ci, Gi) with an instance of 

(C2,G2). 

2. It should be complete, in that resolve(ci, C2) is an instance of (c, G). We may not be 
able to capture every possible resolvent in the augmented clause (c, G), but we should 
surely capture the base case that is obtained by resolving the base instance ci directly 
against the base instance C2. 

3. It should be monotonic, in that if Gi < Hi and G2 < H2, then (c, G) is also a resolvent 
of {ci,Hi) and (c2, i^2)- As the clauses being resolved become stronger, the resolvent 
should become stronger as well. 

4. It should be polynomial, in that it is possible to confirm that (c, G) is a resolvent of 
(ci,Gi) and (c2,G2) in polynomial time. 
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5. It should be stable, in that if cf = C2 , then (resolve(ci, C2), G) is a resolvent of 
(ci,G) and (c2,G). Roughly speaking, this says that if the groups in the two input 
clauses are the same, then the augmented resolvent can be obtained by resolving the 
base instances ci and C2 and then operating with the same group. 

6. It should be strong, in that if no element of ^ is moved by G2 and similarly no element 
of c^^ is moved by Gi, then (resolve(ci, C2), (Gi,G2)) is a resolvent of (ci,Gi) and 
(c2, G2). This says that if the group actions are distinct in that Gi acts on ci and leaves 
C2 completely alone and vice versa, then we should be able to get the complete group of 
resolvents in our answer. This group corresponds to be base resolvent resolve(ci, C2) 
acted on by the group generated by Gi and G2 . 

Definition 5.4 A definition of augmented resolution will be called satisfactory if it satisfies 
the above conditions. 

Note that wc do not require that the definition of augmented resolution be unique. Our 
goal is to define conditions under which (c, G) is an augmented resolvent of (ci , Gi ) and 
(c2,G2), as opposed to "the" augmented resolvent of (ci,Gi) and (c2,G2). To the best of 
our knowledge (and as suggested by Proposition 5.3), there is no satisfactory definition of 
augmented resolution that defines the resolvent of two augmented clauses uniquely. 

As we work toward a satisfactory definition of augmented resolution, let us consider some 
examples to help understand what the basic issues are. Consider, for example, resolving 
the two clauses 

(aV6,((6c))) 
which has instances a V 6 and a V c and 

which has the single instance -la V d. We will write these somewhat more compactly as 

(a V b, {be)) (19) 

and 

(-.a V d, 1) (20) 

respectively. 

Resolving the clauses individually, we see that we should be able to derive the pair of 
clauses by d and c V d; in other words, the augmented clause 

(6 V d, {be)) (21) 

It certainly seems as if it should be possible to capture this in our setting, since the base 
instance of (21) is just the resolvent of the base instances of (19) and (20).^'^ Where does 
the group generated by (6c) come from? 

To answer this, we begin by introducing some additional notation. 

10. Indeed, condition (6) requires that & V d be an instance of some augmented resolvent, since the groups 
act independently in this case. 
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Definition 5.5 Let u he a permutation and S a set. Then by io\s we will denote the result 
of restricting the permutation to the given set. 

Note that u!\s will be a permutation on S if and only if S"^ = S, so that S is fixed by to. 

Definition 5.6 For Ki, . . . , C L and Gi, . . . , Gn < Sym(L), we will say that a permu- 
tation to € Sym(L) is an extension of {Gi, . . . , Gn} for {Ki, . . . , Kn} if there are gi £ Gi 
such that for all i, uj\Ki = gi\Ki- We will denote the set of such extensions by extii{Ki, Gi). 

The definition says that any particular extension x G extn{Ki,Gi) must simultaneously 
extend elements of all of the individual groups Gi, when those groups act on the various 

subsets Ki. 

As an example, suppose that Ki = {a, b} and K2 = {-'a, e}, with Gi = Sym{6, c, d} and 
G2 = {{ed)). A permutation is an extension of the Gi for Ki if and only if, when restricted 
to {a,b} it is a member of Sjm.{b,c,d) and, when restricted to {a, e} it is a member of 
{{ed)). In other words, b can be mapped to b, c or d, and e can be mapped to d if desired. 
The set of extensions is thus 

{(), (6c), {bed), {bdc), {cd), {ed), {edc), {bc){de)} (22) 

Note that this set is not a group because it is not closed under the group operations; {edc) 
is permitted (e is mapped to d and we don't care where d and c go), but (edc)^ = {ecd) is 
not. 

Considered in the context of resolution, suppose that we are trying to resolve aug- 
mented clauses (ci,Gi) and (c2,G2). At some level, any result (resolve(ci, C2), G) for 
which G C extn(cj,Gi) should be a legal resolvent of the original clauses, since each in- 
stance is sanctioned by resolving instances of the originals. We can't define the resolvent 
to be (resolve(ci, C2), extn(ci, Gj)) because, as wc have seen from our example, neither 
extn(ci,Gi) nor extn(cj,Gj) n Wn need be a group. (We also know this from Proposi- 
tion 5.3; there may be no single group that captures all of the possible resolvents.) But we 
can't simply require that the augmented resolvent {c,G) have G Q extn(cj,Gi), because 
there is no obvious polynomial test for inclusion of a group in a set.^^ 

To overcome these difficulties, we need a version of Definition 5.6 that produces a group 
of extensions, as opposed to just a set: 

Definition 5.7 For Ki, . . . , Kn Q L and Gi, . . . , Gn < Sym(L), we will say that a per- 
mutation u G Sym(L) is a stable extension of \G\, . . . , Gn\ for {K\, . . . , Kn} if there are 
gi G Gi such that for all i, (jo\^Gi = gi\j^Gi. We will denote the set of stable extensions of 

{Gi, . . . ,G„} for {Ki, . . .,Kn}by stab(k,, Gi). 

11. We can't write G < extn(ci, d) because extn(ci, d) may not be a group. 

12. It is possible to test in polynomial time if G < H, since we can simply test each generator of G for 
membership in H. But if H is not closed under the group operation, the fact that the generators are all 
in H is not sufficient to conclude that G C H. 
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This definition is modified from Definition 5.6 only in that the restriction of to is not just 
to the original sets Ki for each i, but to K'f\ the Gj-closure of Ki (recall Definition 3.8). 

In our example where Ki = {a, b} and K2 = {^a, e}, with Gi = Sym{6, c, d} and G2 = 
{{ed)), a stable extension must be a member of Sym(6, c, d) when restricted to {a,b,c,d} 
(the Gi-closure of Ki), and must be a member of {{ed)) when restricted to {d,e}. This 
means that we do care where a candidate permutation maps c and d, so that the set of 
stable extensions, instead of (22), is instead simply {(), (6c)} = {{be)). The fact that d has 
to be mapped to b, c, or d by virtue of Gi and has to be mapped to either d or e by virtue 
of G2 means that d has to be fixed by any permutation in stab{Ki,Gi), which is why the 
resulting set of stable extensions is so small. 

In general, we have: 

Proposition 5.8 stab{Ki,Gi) < Sym(L). 

In other words, stah^Ki, Gi) is a subgroup of Sym(L). 

At this point, we still need to deal with the monotonicity condition (3) of Definition 5.4. 
After all, if we have 

{c,G)^{c',G') 

we should also have 

resolve((c, G), {d, H)) \= resolve((c', G'), {d, H)) 
To see why this is an issue, suppose that we are resolving 

(aV6,Sym{6,c,d}) (23) 

with 

(-aVe,(ed)) (24) 

Because the groups both act on d, we have already seen that if we take the group of stable 

extensions as the group in the resolvent, we will conclude (6 V e, (be)). But if we replace 
(24) with (-la V e, 1) before resolving, the result is the stronger (6 V e, Sym{6, c, d}). If we 
replace (23) with (a V b, {be)), the result is the different but also stronger (6 V e, {{be), {de))) 
The monotonicity considerations suggest: 

Definition 5.9 Suppose that (ci,Gi) and (02,^2) o,re augmented clauses where e\ and C2 
resolve in the conventional sense. Then a resolvent of (ci,Gi) and (£2,^2) is any aug- 
m,ented clause of the form (resolve(ci, C2), G) where G < stab(cj, ifj) n W„ for some 
Hi < Gi for i = 1,2. The canonical resolvent of (ci,Gi) and (02,^2), to be denoted by 
resolve((ci, Gi), (c2, G2)), is the augmented clause (resolve(ci, C2), stab(cj, Gj) fl IV„). 

Proposition 5.10 Definition 5.9 of (noncanonical) resolution is satisfactory. The def- 
inition of canonical resolution satisfies all of the conditions of Definition 5.4 except for 
monotonicity. 
Before proceeding, let us consider the example that preceded Definition 5.9 in a bit more
detail. There, we were looking for augmented resolvents of $(a \vee b, \mathrm{Sym}\{b, c, d\})$ from (23)
and $(\neg a \vee e, \langle (de) \rangle)$ from (24).

To find such a resolvent, we begin by selecting $H_1 \le G_1 = \mathrm{Sym}\{b, c, d\}$ and
$H_2 \le G_2 = \langle (de) \rangle$. We then need to use Definition 5.7 to compute the group of stable
extensions of $(c_1, H_1)$ and $(c_2, H_2)$.

If we take $H_1$ and $H_2$ to be the trivial group 1, then the group of stable extensions is
also trivial, so we see that

$$(\mathrm{resolve}(a \vee b, \neg a \vee e), 1) = (b \vee e, 1)$$

is a resolvent of (23) and (24). Other choices for $H_1$ and $H_2$ are more interesting.

If we take $H_1 = 1$ and $H_2 = G_2$, the stable extensions leave the first clause fixed but can
move the image of the second consistent with $G_2$. This produces the augmented resolvent
$(b \vee e, \langle (de) \rangle)$.

If, on the other hand, we take $H_1 = G_1$ and $H_2 = 1$, we have to leave $e$ fixed but can
exchange $b$, $c$ and $d$ freely, and we get $(b \vee e, \mathrm{Sym}\{b, c, d\})$ as the resolvent.

If $H_1 = G_1$ and $H_2 = G_2$, we have already computed the group of stable extensions in
earlier discussions of this example; the augmented resolvent is $(b \vee e, \langle (bc) \rangle)$, which is
weaker than the resolvent of the previous paragraph. And finally, if we take $H_1 = \langle (bc) \rangle$ and
$H_2 = G_2$, we can exchange $b$ and $c$ or independently exchange $d$ and $e$ so that we get the
augmented resolvent $(b \vee e, \langle (bc), (de) \rangle)$. These choices have already been mentioned in the
discussion of monotonicity that preceded Definition 5.9.
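The five choices of $H_1$ and $H_2$ above can be checked by exhaustive enumeration. The sketch below is only an illustration, not the ZAP implementation. It assumes one reading of Definition 5.7 (a permutation is a stable extension if, for each clause, it agrees with some element of $H_i$ on the image of $c_i$ under $H_i$), it works with the five atoms only (the literal $\neg a$ is fixed throughout), and it discards permutations that move points outside every image, since those cannot change any instance of the resolvent. All function names are hypothetical.

```python
from itertools import permutations

ATOMS = "abcde"                      # atoms only; every group below fixes a and ¬a
IDX = {x: i for i, x in enumerate(ATOMS)}
N = len(ATOMS)
IDENTITY = tuple(range(N))

def cycle(*letters):
    """Permutation (tuple of images) for a single cycle such as (b c)."""
    img = list(IDENTITY)
    for k, x in enumerate(letters):
        img[IDX[x]] = IDX[letters[(k + 1) % len(letters)]]
    return tuple(img)

def generate(gens):
    """Brute-force closure of a generating set; suitable for tiny groups only."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = tuple(g[p[i]] for i in range(N))   # apply p, then g
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

def image(clause, group):
    """All points reachable from the atoms of the clause under the group."""
    return {g[IDX[x]] for g in group for x in clause}

def stable_extensions(pairs):
    """pairs: list of (clause atoms, subgroup); assumed reading of Definition 5.7."""
    images = [image(c, H) for c, H in pairs]
    covered = set().union(*images)
    result = set()
    for w in permutations(range(N)):
        # Simplification: skip permutations that move a point outside every image.
        if any(w[x] != x for x in range(N) if x not in covered):
            continue
        if all(any(all(w[x] == g[x] for x in img) for g in H)
               for (_, H), img in zip(pairs, images)):
            result.add(w)
    return result

G1 = generate([cycle("b", "c"), cycle("b", "c", "d")])   # Sym{b, c, d}
G2 = generate([cycle("d", "e")])                          # <(de)>
BC = generate([cycle("b", "c")])                          # <(bc)>
TRIVIAL = {IDENTITY}
C1, C2 = "ab", "ae"                                       # atoms of a ∨ b and ¬a ∨ e
```

Under these assumptions, `stable_extensions([(C1, G1), (C2, G2)])` is the two-element group generated by $(bc)$, while replacing `G1` by `BC` enlarges the result, which is the failure of monotonicity discussed above.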

There is a variety of additional remarks to be made about Definition 5.9. First, canonical
resolution lacks the monotonicity property, as shown by our earlier example. In addition, the
resolvent of two augmented clauses can obviously depend on the choice of the representative
elements in addition to the choice of subgroup of $\mathrm{stab}(c_i, G_i)$. Thus, if we resolve

$$(l_1, \langle (l_1 l_2) \rangle) \tag{25}$$

with

$$(\neg l_1, 1) \tag{26}$$

we get a contradiction. But if we rewrite (25) so that we are attempting to resolve (26)
with

$$(l_2, \langle (l_1 l_2) \rangle)$$

no resolution is possible at all.

To address this in a version of RBL that has been lifted to our more general setting, 
we need to ensure that if we are trying to resolve $(c_1, G_1)$ and $(c_2, G_2)$, the base instances
$c_1$ and $c_2$ themselves resolve. As we will see, this can be achieved by maintaining ground
reasons for each literal in the annotated partial assignment. These ground clauses will 
always resolve when a contradiction is found and the search needs to backtrack. 

We should also point out that there are computational issues involved in the evaluation
of $\mathrm{stab}(c_i, G_i)$. If the component groups $G_1$ and $G_2$ are described by listing their elements,
an incremental construction is possible where generators are gradually added until it is
impossible to extend the group further without violating Definition 5.9. But if $G_1$ and $G_2$
are described only in terms of their generators, as suggested by the results in Section 4.3,
computing $\mathrm{stab}(c_i, G_i)$ involves the following computational subtasks (Dixon et al., 2004a)
(recall Definition 3.9):$^{13}$

1. Given a group $G$ and set $C$, find $G_{\{C\}}$.

2. Given a group $G$ and set $C$, find $G_C$.

3. Given two groups $G_1$ and $G_2$ described in terms of generators, find a set of generators
for $G_1 \cap G_2$.

4. Given $G$ and $C$, let $\omega \in G_{\{C\}}$. Now $\omega|_C$, the restriction of $\omega$ to $C$, makes sense
because $C^\omega = C$. Given a $\rho$ that is such a restriction, find an element $\rho' \in G$ such
that $\rho'|_C = \rho$.

We will have a great deal more to say about these issues in the paper describing the 
ZAP implementation. At this point, we remark merely that time complexity is known to 
be polynomial only for the second and fourth of the above problems; the other two are not 
known to be in polynomial time. However, computational group theory systems incorporate 
procedures that rarely exhibit super-polynomial behavior even though one can construct 
examples that force them to take exponential time (as usual, in terms of the number of 
generators of the groups, not their absolute size). 
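To make the first, second and fourth subtasks concrete, the following sketch computes them by listing group elements explicitly. It assumes the standard reading of the notation (the setwise stabilizer maps the set $C$ onto itself; the pointwise stabilizer fixes each element of $C$), and it is exponential in general, so it illustrates the definitions rather than the generator-based algorithms the text refers to. The function names are hypothetical.

```python
from itertools import permutations

def setwise_stabilizer(group, points):
    """Elements mapping the set `points` onto itself (subtask 1)."""
    pts = set(points)
    return {g for g in group if {g[x] for x in pts} == pts}

def pointwise_stabilizer(group, points):
    """Elements fixing every member of `points` individually (subtask 2)."""
    return {g for g in group if all(g[x] == x for x in points)}

def restriction(g, points):
    """The restriction of g to `points`, as a dict; meaningful for setwise stabilizers."""
    return {x: g[x] for x in points}

def lift(group, points, rho):
    """Subtask 4 by exhaustive search: an element of `group` restricting to rho, or None."""
    return next((g for g in group if restriction(g, points) == rho), None)

S4 = set(permutations(range(4)))     # Sym{0,1,2,3}; each tuple lists the images of 0..3
C = (0, 1)
```

For `S4` and `C = (0, 1)`, the setwise stabilizer has four elements and the pointwise stabilizer two; `lift(S4, C, {0: 1, 1: 0})` returns some permutation exchanging 0 and 1.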

In the introduction, we claimed that the result of resolution was unique using reasons 
and that ZAP's fundamental inference step was both in NP with respect to the ZAP repre- 
sentation and of low-order polynomial complexity in practice. The use of reasons breaks 
the ambiguity surrounding (25) and (26), and the remarks regarding complexity correspond
to the computational observations just made. 

6. Examples and Proof Complexity 

Let us now turn to the examples that we have discussed previously: first-order axioms that 
are quantified over finite domains, along with the standard examples from proof complexity, 
including pigeonhole problems, clique coloring problems and parity clauses. For the first, 
we will see that our ideas generalize conventional notions of quantification while providing 
additional representational flexibility in some cases. For the other examples, we will present 
a ground axiomatization, recast it using augmented clauses, and then give a polynomially 
sized derivation of unsatisfiability using augmented resolution. 

6.1 Lifted Clauses and QPROP 

To deal with lifted clauses, suppose that we have a quantified clause such as 

$$\forall xyz.\; a(x, y) \vee b(y, z) \vee c(z) \tag{27}$$

13. There is an additional requirement that we be able to compute $\mathrm{stab}(c_i, G_i) \cap W_n$ from
$\mathrm{stab}(c_i, G_i)$. This is not an issue in practice because we work with an overall representation
in which all groups are represented by their intersections with $W_n$. Thus if $g$ is included as a
generator for a group $G$, we automatically include in the generators for $G$ the "dual"
permutation to $g$ where every literal has had its sign flipped.
Suppose that the domain of $x$ is $X$, that of $y$ is $Y$, and that of $z$ is $Z$. Thus a grounding of
the clause (27) involves working with a map that takes three elements $x \in X$, $y \in Y$ and
$z \in Z$ and produces the ground atoms corresponding to $a(x, y)$, $b(y, z)$ and $c(z)$. In other
words, if $V$ is the set of variables in our problem, there are injections

$$a : X \times Y \to V$$
$$b : Y \times Z \to V$$
$$c : Z \to V$$

where the images of $a$, $b$ and $c$ are disjoint and each is an injection because distinct relation
instances must be mapped to distinct ground atoms.

Now given a permutation $\omega$ of the elements of $X$, we can define a permutation $\rho_X(\omega)$
on $V$ given by:

$$\rho_X(\omega)(v) = \begin{cases} a(\omega(x), y), & \text{if } v = a(x, y); \\ v, & \text{otherwise.} \end{cases}$$

In other words, there is a mapping $\rho_X$ from the set of permutations on $X$ to the set of
permutations on $V$:

$$\rho_X : \mathrm{Sym}(X) \to \mathrm{Sym}(V)$$

Definition 6.1 Let $G$ and $H$ be groups and $f : G \to H$ a function between them. $f$ will be
called a homomorphism if it respects the group operation in that $f(g_1 g_2) = f(g_1) f(g_2)$.

It should be clear that:

Proposition 6.2 $\rho_X : \mathrm{Sym}(X) \to \mathrm{Sym}(V)$ is an injection and a homomorphism. □

In other words, $\rho_X$ makes a "copy" of $\mathrm{Sym}(X)$ inside of $\mathrm{Sym}(V)$ corresponding to permuting
the elements of $x$'s domain $X$.

In a similar way, we can define homomorphisms $\rho_Y$ and $\rho_Z$ given by

$$\rho_Y(\omega)(v) = \begin{cases} a(x, \omega(y)), & \text{if } v = a(x, y); \\ b(\omega(y), z), & \text{if } v = b(y, z); \\ v, & \text{otherwise} \end{cases}$$

and

$$\rho_Z(\omega)(v) = \begin{cases} b(y, \omega(z)), & \text{if } v = b(y, z); \\ c(\omega(z)), & \text{if } v = c(z); \\ v, & \text{otherwise.} \end{cases}$$

Now consider the subgroup of $\mathrm{Sym}(V)$ generated by the three images $\rho_X(\mathrm{Sym}(X))$,
$\rho_Y(\mathrm{Sym}(Y))$ and $\rho_Z(\mathrm{Sym}(Z))$. It is clear that the three images commute with one another,
and that their intersection is only the trivial permutation. This means that $\rho_X$, $\rho_Y$ and $\rho_Z$
collectively inject the product $\mathrm{Sym}(X) \times \mathrm{Sym}(Y) \times \mathrm{Sym}(Z)$ into $\mathrm{Sym}(V)$; we will denote
this by

$$\rho_{XYZ} : \mathrm{Sym}(X) \times \mathrm{Sym}(Y) \times \mathrm{Sym}(Z) \to \mathrm{Sym}(V)$$

and it should be clear that the original quantified axiom (27) is equivalent to the augmented
axiom

$$(a(A, B) \vee b(B, C) \vee c(C),\ \rho_{XYZ}(\mathrm{Sym}(X) \times \mathrm{Sym}(Y) \times \mathrm{Sym}(Z)))$$

where $A$, $B$ and $C$ are any (not necessarily distinct) elements of $X$, $Y$ and $Z$, respectively.
The quantification is exactly captured by the augmentation.
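The equivalence between the quantified axiom (27) and its augmented form can be checked on small domains. The sketch below uses hypothetical domains and names; it applies every element of the product of the three symmetric groups to a base instance and compares the resulting set with the set of all groundings.

```python
from itertools import permutations, product

# Hypothetical small domains, chosen only for illustration.
X, Y, Z = ("x1", "x2"), ("y1", "y2", "y3"), ("z1", "z2")

def ground(x, y, z):
    """Ground instance of a(x,y) ∨ b(y,z) ∨ c(z), as a set of ground atoms."""
    return frozenset({("a", x, y), ("b", y, z), ("c", z)})

def sym(domain):
    """Sym(domain) as a list of dicts, one per permutation."""
    return [dict(zip(domain, img)) for img in permutations(domain)]

def augmented_instances(A, B, C):
    """Images of the base instance under the embedded product group."""
    return {ground(sx[A], sy[B], sz[C])
            for sx, sy, sz in product(sym(X), sym(Y), sym(Z))}

all_groundings = {ground(x, y, z) for x, y, z in product(X, Y, Z)}
```

Any choice of base elements gives the same instance set, which is the content of the remark that $A$, $B$ and $C$ may be arbitrary.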

The interesting thing is what happens to resolution in this setting: 

Proposition 6.3 Let $p$ and $q$ be quantified clauses such that there is a term $t_p$ in $p$ and
$\neg t_q$ in $q$ where $t_p$ and $t_q$ have common instances. Suppose also that $(p_g, P)$ is an augmented
clause equivalent to $p$ and $(q_g, Q)$ is an augmented clause equivalent to $q$, where $p_g$ and $q_g$
resolve. Then if no terms in $p$ and $q$ except for $t_p$ and $t_q$ have common instances, the result
of resolving $p$ and $q$ in the conventional lifted sense is equivalent to $\mathrm{resolve}((p_g, P), (q_g, Q))$.

Here is an example. Suppose that $p$ is

$$a(A, x) \vee b(C, y, z) \vee c(x, y, z) \tag{28}$$

and $q$ is

$$a(B, x) \vee \neg b(x, D, z) \tag{29}$$

so that the most general unifier of the two appearances of $b$ binds $x$ to $C$ in (29) and $y$ to
$D$ in (28) to produce

$$a(A, x) \vee c(x, D, z) \vee a(B, C) \tag{30}$$

In the version using augmented clauses, it is clear that if we select ground instances $p_g$ of
(28) and $q_g$ of (29) that resolve, the resolvent will be a ground instance of (30); the interesting
part is the group. For this, note simply that the image of $p_g$ is the entire embedding of
$\mathrm{Sym}(X) \times \mathrm{Sym}(Y) \times \mathrm{Sym}(Z)$ into $\mathrm{Sym}(L)$ corresponding to the lifted axiom (28), and the
image of $q_g$ is similarly the embedded image of $\mathrm{Sym}(X) \times \mathrm{Sym}(Z)$ corresponding to (29).

The group of stable extensions of the two embeddings corresponds to any bindings for
the variables in (28) and (29) that can be extended to a permutation of all of the variables in
question; in other words, to bindings that (a) are consistent in that common ground literals
in the two expressions are mapped to the same ground literal by both sets of bindings, and
(b) are disjoint in that we do not attempt to map two quantified literals to the same ground
instance. This latter condition is guaranteed by the conditions of the proposition, which
require that the non-resolving literals have no common ground instances. In this particular
example, if we choose the instances

$$a(A, \alpha) \vee b(C, D, \beta) \vee c(\alpha, D, \beta)$$

for (28) and

$$a(B, C) \vee \neg b(C, D, \beta)$$

for (29), the resulting augmented clause is

$$(a(A, \alpha) \vee c(\alpha, D, \beta) \vee a(B, C),\ G) \tag{31}$$

where $G$ is the group mapping $\mathrm{Sym}(X) \times \mathrm{Sym}(Z)$ into $\mathrm{Sym}(L)$ so that (31) corresponds to
the quantified clause (30).

The condition requiring lack of commonality of ground instances is necessary; consider
resolving

$$a(x) \vee b$$

with

$$a(y) \vee \neg b$$

In the quantified case, we get

$$\forall xy.\; a(x) \vee a(y) \tag{32}$$

In the augmented case, it is not hard to see that if we resolve $(a(A) \vee b, G)$ with
$(a(A) \vee \neg b, G)$ we get

$$(a(A), G)$$

corresponding to

$$\forall x.\; a(x) \tag{33}$$

while if we choose to resolve $(a(A) \vee b, G)$ with $(a(B) \vee \neg b, G)$, we get instead

$$\forall x \ne y.\; a(x) \vee a(y)$$

It is not clear which of these representations is superior. The conventional (32) is more 
compact, but obscures the fact that the stronger (33) is entailed as well. This particular 
example is simple, but other examples involving longer clauses and some residual unbound 
variables can be more complex. 



6.2 Proof Complexity 

We conclude this section with a demonstration that ZAP can produce polynomial proofs of
the problem instances appearing in the original table of the introduction.



6.2.1 Pigeonhole Problems 

Of the examples known to be exponentially difficult for conventional resolution-based
systems, pigeonhole problems are in many ways the simplest. As usual, we will denote by $p_{ij}$
the fact that pigeon $i$ (of $n + 1$) is in hole $j$ of $n$, so that there are $n^2 + n$ variables in
the problem. We denote by $G$ the subgroup of $W_{n^2+n}$ that allows arbitrary exchanges of
the $n + 1$ pigeons or the $n$ holes, so that $G$ is isomorphic to $S_{n+1} \times S_n$. This particular
example is straightforward because there is a single global group that we will be able to use
throughout the analysis.

Our axiomatization is now:

$$(\neg p_{11} \vee \neg p_{21}, G) \tag{34}$$

saying that no two pigeons can be in the same hole, and

$$(p_{11} \vee \cdots \vee p_{1n}, G) \tag{35}$$

saying that the first (and thus every) pigeon has to be in some hole.
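As a sanity check on (34) and (35), the sketch below enumerates their instances under $G$ by brute force and compares them with the conventional ground axiomatization. The encoding of a literal as a `(sign, pigeon, hole)` triple and the function names are assumptions made for this illustration; the enumeration visits all $(n+1)!\,n!$ group elements and is practical only for very small $n$.

```python
from itertools import combinations, permutations, product

def php_instances(n):
    """Instances of (34) and (35) under G = S_{n+1} x S_n, by exhaustive enumeration."""
    pigeons, holes = range(1, n + 2), range(1, n + 1)
    base34 = [(-1, 1, 1), (-1, 2, 1)]            # ¬p_11 ∨ ¬p_21
    base35 = [(+1, 1, j) for j in holes]         # p_11 ∨ ... ∨ p_1n
    inst34, inst35 = set(), set()
    for sp, sh in product(permutations(pigeons), permutations(holes)):
        def move(lit):
            # Permute the pigeon and hole indices; the sign is untouched.
            return (lit[0], sp[lit[1] - 1], sh[lit[2] - 1])
        inst34.add(frozenset(map(move, base34)))
        inst35.add(frozenset(map(move, base35)))
    return inst34, inst35

def php_ground(n):
    """The conventional ground axiomatization, for comparison."""
    pigeons, holes = range(1, n + 2), range(1, n + 1)
    no_sharing = {frozenset({(-1, i, j), (-1, k, j)})
                  for i, k in combinations(pigeons, 2) for j in holes}
    some_hole = {frozenset((+1, i, j) for j in holes) for i in pigeons}
    return no_sharing, some_hole
```

For $n = 3$ the two augmented clauses stand for 18 and 4 ground clauses respectively.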

Proposition 6.4 There is an augmented resolution proof of polynomial size of the mutual 
unsatisfiability of (34) and (35). 

Proof. This is a consequence of the stronger Proposition 6.5 below, but we also present an 
independent proof in the appendix. The ideas in the proof are needed in the analysis of the 
clique-coloring problem. □ 
Proposition 6.5 Any implementation of Procedure 2.8 that branches on positive literals
in unsatisfied clauses on line 12 will produce a proof of polynomial size of the mutual
unsatisfiability of (34) and (35), independent of specific branching choices made.

This strikes us as a remarkable result: Not only is it possible to find a proof of polynomial 
length in the augmented framework, but in the presence of unit propagation, it is difficult 
not to! 

A careful proof of this result is in the appendix, but it will be useful to examine in detail 
how a prover might proceed in a small (four pigeon, three hole) example. 

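We begin by branching on (say) $p_{11}$, saying that pigeon one is in hole one. Now unit
propagation allows us to conclude immediately that no other pigeon is in hole one, so our
annotated partial assignment is:

$$\begin{array}{ll}
\text{literal} & \text{reason} \\ \hline
p_{11} & \text{true} \\
\neg p_{21} & \neg p_{11} \vee \neg p_{21} \\
\neg p_{31} & \neg p_{11} \vee \neg p_{31} \\
\neg p_{41} & \neg p_{11} \vee \neg p_{41}
\end{array}$$

Now we try putting pigeon two in hole two, and extend the above partial assignment
with:

$$\begin{array}{ll}
\text{literal} & \text{reason} \\ \hline
p_{22} & \text{true} \\
\neg p_{12} & \neg p_{22} \vee \neg p_{12} \\
\neg p_{32} & \neg p_{22} \vee \neg p_{32} \\
\neg p_{42} & \neg p_{22} \vee \neg p_{42}
\end{array}$$

At this point, however, we are forced to put pigeons three and four in hole three, which
leads to a contradiction $\bot$:

$$\begin{array}{ll}
\text{literal} & \text{reason} \\ \hline
p_{33} & p_{31} \vee p_{32} \vee p_{33} \\
p_{43} & p_{41} \vee p_{42} \vee p_{43} \\
\bot & \neg p_{33} \vee \neg p_{43}
\end{array}$$

Resolving the last two reasons produces $\neg p_{33} \vee p_{41} \vee p_{42}$, which we can resolve with the
reason for $p_{33}$ to get $p_{41} \vee p_{42} \vee p_{31} \vee p_{32}$. Continuing to backtrack produces
$p_{41} \vee \neg p_{22} \vee p_{31}$.

Operating on the clause $p_{41} \vee \neg p_{22} \vee p_{31}$ with the usual symmetry group (swapping hole
2 and hole 3) produces $p_{41} \vee \neg p_{23} \vee p_{31}$, and now there is nowhere for pigeon two to go. We
resolve these two clauses with $p_{21} \vee p_{22} \vee p_{23}$ to get $p_{41} \vee p_{31} \vee p_{21}$, and thus $\neg p_{11}$. This
implies $\neg p_{ij}$ for all $i$ and $j$ under the usual symmetry, and we conclude that the original
axiomatization was unsatisfiable.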
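The propagation steps of this four pigeon, three hole example can be replayed on the ground clauses. The following is a minimal sketch of conventional (ground) unit propagation, not of the augmented procedure; the clause encoding and function names are assumptions made for illustration, and the order in which forced literals are discovered depends on the clause ordering chosen here.

```python
from itertools import combinations

def php_clauses(pigeons, holes):
    """Ground pigeonhole clauses; a literal is (sign, pigeon, hole)."""
    clauses = [[(True, i, j) for j in holes] for i in pigeons]
    clauses += [[(False, i, j), (False, k, j)]
                for j in holes for i, k in combinations(pigeons, 2)]
    return clauses

def unit_propagate(clauses, assignment):
    """Extend `assignment` (atom -> bool) in place; return a falsified clause or None."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get((i, j)) == sign for sign, i, j in clause):
                continue                                  # clause already satisfied
            open_lits = [lit for lit in clause if (lit[1], lit[2]) not in assignment]
            if not open_lits:
                return clause                             # every literal is false
            if len(open_lits) == 1:
                sign, i, j = open_lits[0]
                assignment[(i, j)] = sign                 # forced value
                changed = True
    return None

clauses = php_clauses(range(1, 5), range(1, 4))
P = {(1, 1): True}                                        # branch: pigeon 1 in hole 1
first = unit_propagate(clauses, P)
after_first = dict(P)
P[(2, 2)] = True                                          # branch: pigeon 2 in hole 2
conflict = unit_propagate(clauses, P)
```

After the first branch, the only forced values are that pigeons two, three and four are not in hole one; after the second, propagation places pigeons three and four in hole three and reports a falsified clause.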

14. We cannot conclude that pigeon two is in hole two "by symmetry" from the existing choice that pigeon
one is in hole one, of course. The symmetry group can only be applied to the original clauses and to
derived nogoods, not to branch choices. Alternatively, the branch choice corresponds to the augmented
clause $(p_{11}, 1)$ and not $(p_{11}, G)$.
6.2.2 Clique Coloring Problems

The pigeonhole problem is difficult for resolution but easy for many other proof systems;
clique coloring problems are difficult for both resolution and other approaches such as
pseudo-Boolean axiomatizations (Pudlak, 1997).

Clique coloring problems are derivatives of pigeonhole problems where the exact nature
of the pigeonhole problem is obscured. Somewhat more specifically, they say that a graph
includes a clique of $n + 1$ nodes (where every node in the clique is connected to every other),
and that the graph must be colored in $n$ colors. If the graph itself is known to be a clique,
the problem is equivalent to the pigeonhole problem. But if we know only that the clique
can be embedded into the graph, the problem is more difficult.

To formalize this, we will use $e_{ij}$ to describe the graph, $c_{ij}$ to describe the coloring of
the graph, and $q_{ij}$ to describe the embedding of the clique into the graph. The graph has
$m$ nodes, the clique is of size $n + 1$, and there are $n$ colors available. The axiomatization is:

$$\begin{array}{lll}
\neg e_{ij} \vee \neg c_{il} \vee \neg c_{jl} & \text{for } 1 \le i < j \le m,\ l = 1, \ldots, n & (36) \\
c_{i1} \vee \cdots \vee c_{in} & \text{for } i = 1, \ldots, m & (37) \\
q_{i1} \vee \cdots \vee q_{im} & \text{for } i = 1, \ldots, n + 1 & (38) \\
\neg q_{ij} \vee \neg q_{kj} & \text{for } 1 \le i < k \le n + 1,\ j = 1, \ldots, m & (39) \\
e_{ij} \vee \neg q_{ki} \vee \neg q_{lj} & \text{for } 1 \le i < j \le m,\ 1 \le k \ne l \le n + 1 & (40)
\end{array}$$

Here $e_{ij}$ means that there is an edge between graph nodes $i$ and $j$, $c_{ij}$ means that graph
node $i$ is colored with the $j$th color, and $q_{ij}$ means that the $i$th element of the clique is
mapped to graph node $j$. Thus the first axiom (36) says that two of the $m$ nodes in the
graph cannot be the same color (of the $n$ colors available) if they are connected by an edge.
(37) says that every graph node has a color. (38) says that every element of the clique
appears in the graph, and (39) says that no two elements of the clique map to the same
node in the graph. Finally, (40) says that the clique is indeed a clique: no two clique
elements can map to disconnected nodes in the graph. As in the pigeonhole problems, there
is a global symmetry in this problem in that any two nodes, clique elements or colors can
be swapped.

Proposition 6.6 There is an augmented resolution proof of polynomial size of the
unsatisfiability of (36)-(40).

The proof in the appendix presents a ZAP proof of size $O(m^2 n^2)$ for clique-coloring
problems, where $m$ is the size of the graph and $n$ is the size of the clique. The ZAP
implementation produces shorter proofs, of size $O((m + n)^2)$ (Dixon et al., 2004a). While
short, these proofs involve the derivation and manipulation of subtle clauses and are too
complex for us to understand.

15. It is not clear whether one should conclude from this something good about ZAP, or something bad about
the authors. Perhaps both.

Before we move on to parity clauses, note that the approach we are proposing is properly
stronger than one based on "symmetry-breaking" axioms (Crawford, Ginsberg, Luks, &
Roy, 1996) or the approaches taken by Krishnamurthy (1985) or Szeider (2003). In the
symmetry-breaking approach, the original axiom set is modified so that as soon as a single
symmetric instance is falsified, so are all of that instance's symmetric variants. Both we and
the other authors (Krishnamurthy, 1985; Szeider, 2003) achieve a similar effect by attaching
a symmetry to the conclusion; either way, all symmetric instances are removed as soon as
it is possible to disprove any. Unlike the approaches of these other authors, however, an
approach based on augmented clauses is capable of exploiting local symmetries present in a
subset of the entire axiom set. The other authors require the presence of a global symmetry
across the entire structure of the problem.
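For readers who want to experiment with the clique coloring axiomatization (36)-(40), the sketch below generates its ground clauses from the prose description given with the axioms: edges forbid equal colors, every node has a color, every clique element is placed somewhere, no two clique elements share a node, and clique elements may not be placed on disconnected nodes. The atom encoding and the function name are assumptions made for this illustration.

```python
from itertools import combinations, permutations

def clique_coloring(m, n):
    """Ground clauses (36)-(40): graph of m nodes, clique of n+1 elements, n colors."""
    nodes, colors, members = range(1, m + 1), range(1, n + 1), range(1, n + 2)

    def neg(*atom):
        return ("not",) + atom

    clauses = {36: [], 37: [], 38: [], 39: [], 40: []}
    for i, j in combinations(nodes, 2):
        for l in colors:                         # adjacent nodes differ in color
            clauses[36].append([neg("e", i, j), neg("c", i, l), neg("c", j, l)])
        for k, l in permutations(members, 2):    # k != l: clique images are adjacent
            clauses[40].append([("e", i, j), neg("q", k, i), neg("q", l, j)])
    for i in nodes:                              # every node has a color
        clauses[37].append([("c", i, l) for l in colors])
    for i in members:                            # every clique element is placed
        clauses[38].append([("q", i, j) for j in nodes])
    for i, k in combinations(members, 2):        # placement is injective
        for j in nodes:
            clauses[39].append([neg("q", i, j), neg("q", k, j)])
    return clauses
```

With $m = 4$ and $n = 2$ this yields 12, 4, 3, 12 and 36 clauses for (36) through (40) respectively, which gives a feel for how quickly the ground representation grows compared with five augmented clauses.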

6.2.3 Parity Clauses 

Rather than discuss a specific example here, we show that determining the satisfiability of 
any set of parity clauses is in P for augmented resolution. The proof of this is modeled on 
a proof that satisfiability of parity clauses is in P: 

Lemma 6.7 Let $C$ be a theory consisting entirely of parity clauses. Then determining
whether or not $C$ is satisfiable is in P.

As discussed in the introduction, the proof is basically a Gaussian reduction argument.

Definition 6.8 Let $S$ be a subset of a set of $n$ variables. We will say that a permutation
$\omega$ flips a variable $v$ if $\omega(v) = \neg v$, and will denote by $F_S$ that subset of $W_n$ consisting of all
permutations that leave the variable order unchanged and flip an even number of variables
in $S$.

Lemma 6.9 $F_S \le W_n$.

We now have the following:

Lemma 6.10 Let $S = \{x_1, \ldots, x_k\}$ be a subset of a set of $n$ variables. Then the parity
clause

$$x_1 + \cdots + x_k \equiv 1 \pmod 2$$

is equivalent to the augmented clause

$$(x_1 \vee \cdots \vee x_k, F_S)$$

The parity clause

$$x_1 + \cdots + x_k \equiv 0 \pmod 2$$

is equivalent to the augmented clause

$$(\neg x_1 \vee x_2 \vee \cdots \vee x_k, F_S)$$
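Lemma 6.10 can be checked directly for small $k$. In the sketch below a clause over $x_1, \ldots, x_k$ is a tuple of signs, the group $F_S$ is enumerated as the set of sign-flip patterns of even weight, and the models of the resulting instances are compared with the assignments of the required parity. The representation and the function names are assumptions made for this illustration.

```python
from itertools import product

def augmented_parity_instances(k, negate_first=False):
    """Instances of (x1 ∨ ... ∨ xk, F_S), or of (¬x1 ∨ x2 ∨ ... ∨ xk, F_S).

    A clause is a tuple of signs: True stands for x_i, False for ¬x_i.
    """
    base = tuple([not negate_first] + [True] * (k - 1))
    instances = set()
    for flips in product([False, True], repeat=k):
        if sum(flips) % 2 == 0:                  # F_S flips an even number of variables
            instances.add(tuple(s != f for s, f in zip(base, flips)))
    return instances

def satisfies(assignment, clause):
    """A clause holds if some variable takes the value its literal asks for."""
    return any(value == sign for value, sign in zip(assignment, clause))

def models(k, clauses):
    """All assignments to k variables satisfying every clause."""
    return {a for a in product([False, True], repeat=k)
            if all(satisfies(a, c) for c in clauses)}
```

Each augmented clause stands for $2^{k-1}$ ground clauses, so the augmented form is exponentially more compact than the ground encoding of a single parity constraint.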

We can finally show: 

Proposition 6.11 Let $C$ be a theory consisting entirely of parity clauses. Then determining
whether or not $C$ is satisfiable is in P for augmented resolution.

We note in passing that the construction in this section fails in the case of modularity
clauses with a base other than 2. One of the (many) problems is that the set of permutations
that flip a set of variables of size congruent to $m \pmod n$ is not a group unless $m = 0$ and
$n < 3$. We need $m = 0$ for the identity to be included, and since both

$$(x_1, \neg x_1) \cdots (x_n, \neg x_n)$$

and

$$(x_2, \neg x_2) \cdots (x_{n+1}, \neg x_{n+1})$$

are included, it follows that their product

$$(x_1, \neg x_1)(x_{n+1}, \neg x_{n+1})$$

must be included, so that $n = 1$ or $n = 2$.

It is not clear whether this is coincidence, or whether there is a deep connection between 
the fact that mod 2 clauses can be expressed compactly using augmented clauses and are 
also solvable in polynomial time. 

7. Theoretical and Procedural Description 

In addition to resolution, an examination of Procedures 2.8 and 2.6 shows that we need to be 
able to eliminate nogoods when they are irrelevant and to identify instances of augmented 
clauses that are unit. Let us now discuss each of these issues. 

The problems around irrelevance are easier to deal with. In the ground case, we remove 
clauses when they are no longer relevant; in the augmented version, we remove clauses 
that no longer possess relevant instances. We will defer until the final paper in this series 
discussion of a procedure for determining whether (c, G) has a relevant instance. 

We will also defer discussion of a specific procedure for computing unit-propagate(P),
but do include a few theoretical comments at this point. In unit propagation, we have a
partial assignment $P$ and need to determine which instances of axioms in $C$ are unit. To
do this, suppose that we denote by $S(P)$ the set of Satisfied literals in the theory, and by
$U(P)$ the set of Unvalued literals. Now for a particular augmented clause $(c, G)$, we are
looking for those $g \in G$ such that $c^g \cap S(P) = \emptyset$ and $|c^g \cap U(P)| \le 1$. The first condition
says that $c^g$ has no satisfied literals; the second, that it has at most one unvalued literal.

Procedure 7.1 (Unit propagation) To compute Unit-Propagate(C, P) for a set $C$ of
augmented clauses and an annotated partial assignment $P = \langle (l_1, c_1), \ldots, (l_n, c_n) \rangle$:

1 while there is a $(c, G) \in C$ and $g \in G$ with $c^g \cap S(P) = \emptyset$ and $|c^g \cap U(P)| \le 1$
2     do if $c^g \cap U(P) = \emptyset$
3         then $l_i \leftarrow$ the literal in $c^g$ with the highest index in $P$
4              return $\langle \text{true}, \mathrm{resolve}((c, G), c_i) \rangle$
5         else $l \leftarrow$ the literal in $c^g$ unassigned by $P$
6              add $(l, (c^g, G))$ to $P$
7 return $\langle \text{false}, P \rangle$

Note that the addition made to $P$ when adding a new literal includes both $c^g$, the instance
of the clause that led to the propagation, and the augmenting group as usual. We can use
$(c^g, G)$ as the augmented clause by virtue of Proposition 4.7.
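The test on line 1 of Procedure 7.1 can be illustrated by a naive search over an explicitly listed group. This is only a sketch under that assumption (the point of the augmented representation is to avoid listing the group at all), and the names below are hypothetical. The example reuses the four pigeon, three hole problem after branching on $p_{11}$.

```python
from itertools import permutations, product

def unit_instances(clause, group, satisfied, unvalued):
    """Instances c^g with no satisfied literal and at most one unvalued literal.

    `group` is an explicit list of permutations, each a dict literal -> literal;
    `satisfied` and `unvalued` play the roles of S(P) and U(P).
    """
    found = set()
    for g in group:
        instance = frozenset(g.get(lit, lit) for lit in clause)
        if not (instance & satisfied) and len(instance & unvalued) <= 1:
            found.add(instance)
    return found

pigeons, holes = (1, 2, 3, 4), (1, 2, 3)
literals = [(s, i, j) for s in (True, False) for i in pigeons for j in holes]
# Every exchange of pigeons and of holes, acting on literals and preserving signs.
group = [{(s, i, j): (s, sp[i - 1], sh[j - 1]) for s, i, j in literals}
         for sp, sh in product(permutations(pigeons), permutations(holes))]
clause = [(False, 1, 1), (False, 2, 1)]                    # ¬p_11 ∨ ¬p_21
satisfied = {(True, 1, 1)}                                 # S(P) after branching on p_11
unvalued = {lit for lit in literals if (lit[1], lit[2]) != (1, 1)}
units = unit_instances(clause, group, satisfied, unvalued)
```

Here `units` contains exactly the three instances that pair $\neg p_{11}$ with $\neg p_{21}$, $\neg p_{31}$ or $\neg p_{41}$, matching the first table of the pigeonhole example.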
Finally, the augmented version of Procedure 2.8 is:

Procedure 7.2 (Relevance-bounded reasoning, RBL) Let $C$ be a SAT problem, and $D$
a set of learned nogoods. Let $P$ be an annotated partial assignment, and $k$ a fixed relevance
bound. To compute RBL(C, D, P):

1 $\langle x, y \rangle \leftarrow$ Unit-Propagate($C \cup D$, $P$)
2 if $x$ = true
3     then $(c, G) \leftarrow y$
4         if $c$ is empty
5             then return failure
6             else remove successive elements from $P$ so that $c$ is unit
7                 $D \leftarrow D \cup \{(c, G)\}$
8                 remove from $D$ all augmented clauses without $k$-relevant instances
9                 return RBL(C, D, P)
10    else $P \leftarrow y$
11        if $P$ is a solution to $C$
12            then return $P$
13            else $l \leftarrow$ a literal not assigned a value by $P$
14                return RBL($C$, $D$, $\langle P, (l, \text{true}) \rangle$)



Examining these two procedures, we see that we need to provide implementations of the 
following: 

1. A routine that computes the group of stable extensions appearing in the definition of 
augmented resolution, needed by line 4 in the unit propagation procedure 7.1. 

2. A routine that finds instances $c^g$ of $(c, G)$ for which $c^g \cap S = \emptyset$ and $|c^g \cap U| \le 1$ for
disjoint $S$ and $U$, needed by line 1 in the unit propagation procedure 7.1.

3. A routine that determines whether $(c, G)$ has an instance $c^g$ for which $\mathrm{poss}(c^g, P) \le k$
for some fixed $k$, as needed by line 8 of Procedure 7.2.

All of these problems are known to be NP-complete, although we remind the reader that 
we continue to measure complexity in terms of the size of the domain and the number of 
generators of any particular group; the number of generators is logarithmic in the number 
of instances of any particular augmented clause. It is also the case that the practical 
complexity of solving these problems appears to be low-order polynomial. 

Our focus in the final paper in this series will be on the development of efficient
procedures that achieve the above goals, their incorporation into a zChaff-like prover, and an
evaluation of the performance of the resulting system.
8. Conclusion

Our aim in this paper has been to give a theoretical description of a generalized
representation scheme for satisfiability problems. The basic building block of the approach is an
"augmented clause," a pair (c, G) consisting of a ground clause c and a group G of permu- 
tations on the literals in the theory; the intention is that the augmented clause is equivalent 
to the conjunction of the results of operating on c with elements of G. We argued that the 
structure present in the requirement that G be a group provides a generalization of a wide 
range of existing notions, from quantification over finite domains to parity clauses. 

We went on to show that resolution could be extended to deal with augmented clauses, 
and gave a generalization of relevance-bounded learning in this setting (Procedures 7.1 and 
7.2). We also showed that the resulting proof system generalized first-order techniques when 
applied to finite domains of quantification, and could produce polynomially sized proofs of 
the pigeonhole problem, clique coloring problems, Tseitin's graph coloring problems, and 
parity clauses in general. 

Finally, we described the specific group-theoretic problems that need to be addressed in 
implementing our ideas. As discussed in the previous section, they are: 

1. Implementing the group operation associated with the generalization of resolution, 

2. Finding unit instances of an augmented clause, and 

3. Determining whether an augmented clause has relevant instances. 

We will return to these issues in the final paper in this series (Dixon et al., 2004a), which
describes an implementation of our ideas and its computational performance.
Appendix A. Proofs

Proposition 2.7 Suppose that $C$ is a Boolean satisfiability problem, and $P$ is a sound
annotated partial assignment. Then:

1. If unit-propagate(P) = $\langle \text{false}, P' \rangle$, then $P'$ is a sound extension of $P$, and

2. If unit-propagate(P) = $\langle \text{true}, c \rangle$, then $c$ is a nogood for $P$.

Proof. In the first case, we need to show that any extension of $P$ in the procedure leaves
$P$ a sound partial assignment. In other words, when we add $(l, c)$ to $P$, we must show that:

1. $C \models c$,

2. $l$ appears in $c$, and

3. Every other literal in $c$ is false by virtue of an assignment in $P$.

For (1), note that $c \in C$. For (2), $l$ is explicitly set to a literal in $c$. And for (3), since
$c \in \mathrm{poss}_0(C, P)$, every other literal in $c$ must be set false by $P$.

In the second case in the proposition, $C \models c$ because $c$ is the result of resolving a clause
in $C$ with some reason $c_i$, which is entailed by $C$ by virtue of the soundness of $P$. To see
that $c$ is falsified by $P$, note that the clause in $\mathrm{poss}_{-1}(C, P)$ is surely falsified by $P$, and
that every literal in the reason $c_i$ for $l_i$ is also falsified except for $l_i$ itself. It follows that the
result of resolving these two clauses will also be falsified by the assignments in $P$. □

Theorem 2.9 RBL is sound and complete in that it will always return a solution to a
satisfiable theory $C$ and always report failure if $C$ is unsatisfiable. RBL also uses an amount
of memory polynomial in the size of $C$ (although exponential in the relevance bound $k$).

Proof. Soundness is immediate. For completeness, note that every nogood learned
eliminates an additional portion of the search space, and the backtrack is constrained to not go
so far that the newly learned nogood is itself removed as irrelevant.

For the last claim, we extend Definition 2.3 somewhat, defining a reason for a literal $l$ to
be any learned clause involving $l$ where $l$ was the most recently valued literal at the point
that the clause was learned. We will now show that for any literal $l$, there are never more
than $(2n)^k$ reasons for $l$, where $n$ is the number of variables in the problem.

To see this, let $R$ be the set of reasons for $l$ at some point. Let $r$ be any reason in this
set; between the time that $r \in R$ was learned and the current point, at most $k$ literals in
$r$ could have been unassigned by the then-current partial assignment. It follows that there
is some fixed partial assignment $P'$ that holds throughout the "life" of each $r \in R$ and
such that each $r$ has at most $k$ literals unassigned values by $P'$. Let $S$ be the set of literals
assigned values by $P'$.

Given a reason $r_i \in R$, we will view $r_i$ simply as the set of literals that it contains, so
that $r_i - S$ is the set of literals appearing in $r_i$ but outside of the stable partial assignment
$P'$. Now if $r_j$ was learned before $r_i$, some literal $l \in r_j - S$ must not be in $r_i - S$; otherwise,
$r_j$ together with the stable partial assignment $P'$ would have precluded the set of variable
assignments that led to the conclusion $r_i$. In other words, $r_i - S$ is unique for each reason
in the set $R$.

But we also know that $|r_i - S| \le k$, so that each reason corresponds to choosing at most
$k$ literals from the complement of $S$. If there are $n$ variables in the problem, there are at most
$2n$ literals in this set, so that the number of reasons is bounded by $(2n)^k$. It follows that the
total number of reasons learned is bounded by $(2n)^{k+1}$, and the conclusion follows. □

Theorem A.1 (Lagrange) If $G$ is a finite group and $S \le G$, then $|S|$ divides $|G|$. □






Proposition 4.3 Let $S$ be a set of ground clauses, and $(c, G)$ an equivalent augmented
clause, where $G$ is represented by generators. It is possible in polynomial time to find a set
of generators $\{\omega_1, \ldots, \omega_k\}$ where $k \le \log_2 |G|$ and $G = \langle \omega_1, \ldots, \omega_k \rangle$.

Proof. Even the simplest approach suffices. If $G = \langle g_i \rangle$, checking to see if
$g_i \in \langle \omega_1, \ldots, \omega_j \rangle$ for each generator $g_i$ can be performed in polynomial time using a
well-known method of Sims (Dixon et al., 2004a; Luks, 1993; Seress, 2003); if $g_i$ is already in
the generated set we do nothing and otherwise we add it as a new generator. By virtue of
Lagrange's theorem, a subgroup can never be larger than half the size of a group that properly
contains it, so adding a new generator to the set of $\omega_i$s always at least doubles the size of the
generated set. It follows that the number of generators needed can never exceed $\log_2 |G|$. □
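The filtering argument in this proof can be sketched as follows. The membership test here is done by listing the generated group explicitly, which is exponential in general; it merely stands in for the polynomial-time method of Sims cited in the proof. Permutations are tuples of images over `range(n)`, and the function names are assumptions made for this illustration.

```python
from itertools import permutations

def generate(gens, n):
    """Explicit closure of a generating set of permutations of range(n)."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = tuple(g[p[i]] for i in range(n))   # apply p, then g
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

def prune_generators(gens, n):
    """Keep a generator only if it lies outside the group generated so far."""
    kept, current = [], generate([], n)
    for g in gens:
        if g not in current:          # brute-force membership test (stand-in for Sims)
            kept.append(g)
            current = generate(kept, n)
    return kept, current

redundant = list(permutations(range(4)))          # all 24 elements of S_4 as "generators"
kept, group = prune_generators(redundant, 4)
```

Because every retained generator at least doubles the generated group, `2 ** len(kept)` can never exceed `len(group)`, which is the bound $k \le \log_2 |G|$.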

Proposition 4.7 Let $(c, G)$ be an augmented clause. Then if $c'$ is any instance of $(c, G)$,
$(c, G) \equiv (c', G)$.

Proof. Since $c'$ is an instance of $(c, G)$, we must have $c' = c^g$ for some $g \in G$. Thus the
instances of $(c', G)$ are clauses of the form ${c'}^{g'} = c^{gg'}$. But $c^{gg'} = c^{g''}$ for $g'' = gg' \in G$.
Similarly, an instance of $(c, G)$ is a clause of the form $c^{g'} = c^{g g^{-1} g'} = {c'}^{g^{-1} g'}$, and
$g^{-1} g' \in G$. □

Definition A.2 Let $G \le \mathrm{Sym}(S)$. We will say that $G$ acts transitively on $S$ if, for any
$x, y \in S$, there is a $g \in G$ with $x^g = y$.

Proposition 4.8 Let $(c, G)$ be an augmented clause with $d$ distinct instances. Then there
is a subgroup $H \le G$ that can be described using $O(\log_2 d)$ generators such that
$(c, H) \equiv (c, G)$. Furthermore, given $d$ and generators for $G$, there is a Monte Carlo polynomial-time
algorithm for constructing the generators of such an $H$.

Proof. The basic ideas in the proof follow methods introduced by Babai, Luks and
Seress (1997). The proof of this particular result is a bit more involved than the others in this
paper, and following it is likely to require an existing familiarity with group theory.

Let $D$ be the set of instances of $(c, G)$, so that $G$ acts transitively on $D$. Now consider
a sequence $g_1, g_2, \ldots$ of uniformly distributed random elements of $G$ and, for each $r \ge 0$,
let $H_r = \langle g_1, g_2, \ldots, g_r \rangle$ (in particular, $H_0 = \langle \emptyset \rangle = 1$). Suppose that $H_{r-1}$ does not act
transitively on $D$ and let $K$ be any orbit of $H_{r-1}$ in $D$. Since $G_{\{K\}}$ is a proper subgroup
of $G$, Lagrange's theorem implies that the probability that $g_r \in G_{\{K\}}$ is $\le \frac{1}{2}$. Hence, the
probability that $H_r$ enlarges this $K$ is $\ge \frac{1}{2}$. On average then, at least $\frac{1}{2}$ of the orbits will be
enlarged in passing from $H_{r-1}$ to $H_r$. Since the orbits partition the entire set $D$, an orbit
can only be enlarged if it is merged with one or more other orbits. Thus the fact that at
least half of the orbits are enlarged implies that the total number of such orbits is reduced
by at least $\frac{1}{4}$. Thus for each $r$, the expected number of orbits of $H_r$ in $D$ is $\le d(3/4)^r$. As a
consequence, with high probability, there exists $r = O(\log_2 d)$ such that $H_r$ acts transitively
on $D$. (The probability of failure can be kept below $\epsilon$ for any fixed $\epsilon > 0$.)

An implementation of the algorithm implicit in this proof requires the ability to select 
uniformly distributed random elements of G. These are available at a cost O(f^) per 
element, given the standard data structures for permutation group computation (Seress, 
2003). 16 □ 
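
The random-generator argument in the proof above can be illustrated with a small sketch. It is only a toy under stated assumptions: the group is all of Sym(6) acting on six points (standing in for G acting on the instance set D), the group is enumerated explicitly so that uniform sampling is trivial, and transitivity is tested directly, which is feasible only because D is tiny here. The names `orbit_count` and `random_transitive_generators` are hypothetical.

```python
import random
from itertools import permutations

def orbit_count(generators, n):
    """Number of orbits on {0, ..., n-1} of the group generated by `generators`.

    Each permutation g is a tuple with g[x] the image of x. The orbits are the
    connected components of the graph with edges x -- g[x], found by union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for g in generators:
        for x in range(n):
            parent[find(x)] = find(g[x])
    return len({find(x) for x in range(n)})

def random_transitive_generators(group, n, rng):
    """Draw uniform random elements of `group` until the subgroup they
    generate acts transitively on {0, ..., n-1}."""
    gens = []
    while orbit_count(gens, n) > 1:
        gens.append(rng.choice(group))
    return gens

n = 6
group = list(permutations(range(n)))   # toy stand-in for G: all of Sym(6)
gens = random_transitive_generators(group, n, random.Random(0))
```

Each new random element merges orbits with probability at least 1/2 per orbit, which is why only a logarithmic number of draws is expected before `gens` generates a transitive subgroup.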

16. The reason that we only have a Monte Carlo method is that there is no known deterministic polynomial
time test for testing whether $H \le G$ acts transitively on $D$; note that $D$ may be exponential in the
number of variables in the problem.

Proposition 5.1 For ground clauses $c_1$ and $c_2$ and a permutation $\omega \in W_n$,

$$\mathrm{resolve}(\omega(c_1), \omega(c_2)) = \omega(\mathrm{resolve}(c_1, c_2))$$

Proof. Suppose that the literal being resolved on is $l$, so that if we think of $c_1$ and $c_2$ as
being represented simply by the literals they contain, the resolvent corresponds to

$$c_1 \cup c_2 - \{l, \neg l\}$$

Permuting with $\omega$ gives us

$$\omega(c_1) \cup \omega(c_2) - \{\omega(l), \omega(\neg l)\} = \omega(c_1) \cup \omega(c_2) - \{\omega(l), \neg\omega(l)\} \tag{41}$$

where the equality is a consequence of the fact that the permutation in question is a member
of $W_n$ instead of simply $S_{2n}$. The right hand side of (41) is simply $\mathrm{resolve}(\omega(c_1), \omega(c_2))$. □
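
Proposition 5.1 can be checked on a concrete instance. The sketch below is an illustration only: literals are encoded as nonzero integers (negation is arithmetic negation), and `make_omega` is a hypothetical helper that builds a permutation of the literals commuting with negation, which is the defining property of membership in $W_n$ used in the proof.

```python
def resolve(c1, c2, l):
    """Resolve clause c1 (containing literal l) with c2 (containing -l)."""
    assert l in c1 and -l in c2
    return (c1 | c2) - {l, -l}

def make_omega(var_perm, flips):
    """Build a literal permutation that commutes with negation.

    var_perm maps each variable to a variable; variables listed in `flips`
    additionally have the sign of their image reversed."""
    omega = {}
    for v, w in var_perm.items():
        image = -w if v in flips else w
        omega[v] = image
        omega[-v] = -image
    return omega

def apply(omega, clause):
    return frozenset(omega[l] for l in clause)

c1 = frozenset({1, 2})      # x1 v x2
c2 = frozenset({-1, 3})     # -x1 v x3
omega = make_omega({1: 2, 2: 3, 3: 1}, flips={1})

lhs = resolve(apply(omega, c1), apply(omega, c2), omega[1])
rhs = apply(omega, resolve(c1, c2, 1))
```

Here both `lhs` and `rhs` are the clause containing literals 1 and 3. If `omega` did not commute with negation, the image of $\neg l$ would not be the negation of the image of $l$ and the two sides could differ.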

Proposition 5.3 There are augmented clauses $c_1$ and $c_2$ such that the set $S$ of resolvents
of instances of the two clauses does not correspond to any single augmented clause $(c, G)$.

Proof. Consider resolving the augmented clause

$$c_1 = (a \vee b, (bc))$$

with the two instances $a \vee b$ and $a \vee c$, with the augmented clause

$$c_2 = (\neg a \vee d, (dc))$$

corresponding to $\neg a \vee d$ and $\neg a \vee c$. The ground clauses that can be obtained by resolving
instances of $c_1$ and of $c_2$ are $b \vee d$, $b \vee c$, $c \vee d$, and $c$. Since these clauses are not of uniform
length, they are not instances of a single augmented clause. □
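
The counterexample in the proof of Proposition 5.3 is small enough to enumerate. The following sketch assumes a string encoding of literals (a leading `-` marks negation) and hypothetical helper names; each group is generated by a single transposition, so its elements are the identity and the swap.

```python
def act(perm, clause):
    """Apply a variable permutation (dict) to every literal of a clause."""
    def image(lit):
        negated = lit.startswith('-')
        v = lit.lstrip('-')
        w = perm.get(v, v)
        return '-' + w if negated else w
    return frozenset(image(l) for l in clause)

def instances(clause, swap):
    """Instances under the group generated by one transposition."""
    return {frozenset(clause), act(swap, clause)}

def resolve_on(c1, c2, v):
    return (c1 | c2) - {v, '-' + v}

c1_instances = instances({'a', 'b'}, {'b': 'c', 'c': 'b'})    # (a v b, (bc))
c2_instances = instances({'-a', 'd'}, {'d': 'c', 'c': 'd'})   # (-a v d, (dc))
resolvents = {resolve_on(x, y, 'a') for x in c1_instances for y in c2_instances}
lengths = {len(r) for r in resolvents}
```

Because a permutation of literals preserves clause length, the appearance of both lengths 1 and 2 in `lengths` rules out any single augmented clause having exactly these four resolvents as instances.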

Proposition 5.8 $\mathrm{stab}(K_i, G_i) \le \mathrm{Sym}(L)$.

Proof. Suppose we have $\omega_1, \omega_2 \in \mathrm{stab}(K_i, G_i)$. Now for some fixed $i$ and $g_1 \in G_i$,
$\omega_1|_{K_i^{G_i}} = g_1|_{K_i^{G_i}}$ and similarly for $\omega_2$ and some $g_2$. But now for any $x \in K_i^{G_i}$,

$$x^{\omega_1\omega_2} = (x^{g_1})^{\omega_2} = x^{g_1 g_2}$$

so that $\omega_1\omega_2 \in \mathrm{stab}(K_i, G_i)$. The first equality holds by virtue of the definition of a stable
extension, and the second holds because $x^{g_1}$ is necessarily in the $G_i$-closure of $K_i$. Inversion
is similar. □

Definition 5.4 A definition of augmented resolution will be called satisfactory if any resolvent
$(c, G)$ of $(c_1, G_1)$ and $(c_2, G_2)$ satisfies the following conditions:

1. It is sound, in that $(c_1, G_1) \wedge (c_2, G_2) \models (c, G)$.

2. It is complete, in that $\mathrm{resolve}(c_1, c_2)$ is an instance of $(c, G)$.

3. It is monotonic, in that if $G_1 \le H_1$ and $G_2 \le H_2$, then $(c, G)$ is also a resolvent of
$(c_1, H_1)$ and $(c_2, H_2)$.

4. It is polynomial, in that it is possible to confirm that $(c, G)$ is a resolvent of $(c_1, G_1)$
and $(c_2, G_2)$ in polynomial time.

5. It is stable, in that if $c_1^G = c_2^G$, then $(\mathrm{resolve}(c_1, c_2), G)$ is a resolvent of $(c_1, G)$ and
$(c_2, G)$.

6. It is strong, in that if no element of $c_1^{G_1}$ is moved by $G_2$ and similarly no element
of $c_2^{G_2}$ is moved by $G_1$, then $(\mathrm{resolve}(c_1, c_2), \langle G_1, G_2 \rangle)$ is a resolvent of $(c_1, G_1)$ and
$(c_2, G_2)$.

16 (continued). Note also that the algorithm explicitly requires that we know $d$ in advance. This is necessary
since the quantity is not known to be computable in polynomial time. However, there are methods for
computing $d$ that seem to be efficient in practice.

Proposition 5.10 Definition 5.9 of (noncanonical) resolution is satisfactory. The definition
of canonical resolution satisfies all of the conditions of Definition 5.4 except for
monotonicity.

Proof. We deal with the conditions of the definition one at a time.

1. Soundness Any instance of $(c, G)$ must be of the form

$$\omega(\mathrm{resolve}(c_1, c_2))$$

for some $\omega$ that simultaneously extends $G_1$ acting on $c_1$ and $G_2$ acting on $c_2$. But by
Proposition 5.1, this is just

$$\mathrm{resolve}(\omega(c_1), \omega(c_2))$$

The first of these clauses is an instance of $(c_1, G_1)$, and the second is an instance of $(c_2, G_2)$,
so the proposition follows from the soundness of resolution.

2. Completeness $\mathrm{resolve}(c_1, c_2)$ is an instance of $(c, G)$ because $c = \mathrm{resolve}(c_1, c_2)$
and $1 \in G$.

3. Monotonicity If $(c, G)$ is a resolvent of $(c_1, H_1)$ and $(c_2, H_2)$, then $G = \mathrm{stab}(c_i, K_i)$
where $K_i \le H_i$. But since $H_i \le G_i$, it follows that $K_i \le G_i$ and $(c, G)$ is a resolvent of
$(c_1, G_1)$ and $(c_2, G_2)$ as well.

4. Polytime checking We assume that we are provided with the intermediate groups $H_1$
and $H_2$, so that we must simply check that $G \le \mathrm{stab}(c_i, H_i)$. Since $\mathrm{stab}(c_i, H_i)$ is a group
by virtue of Proposition 5.8, it suffices to check that each generator of $G$ is in $\mathrm{stab}(c_i, H_i)$.
But this is straightforward. Given a generator $g$, we need simply check that restricting $g$ to
$c_i^{G_i}$, the image of $c_i$ under $G_i$, produces a permutation that is the restriction of an element
of $G_i$. As we remarked in proving Proposition 4.3, this test is known to be in P.

5. Stability It is clear that $(\mathrm{resolve}(c_1, c_2), G)$ is a resolvent of $(c_1, G)$ and $(c_2, G)$, since
every element of $G$ is clearly a stable extension of $(c_1, G)$ and $(c_2, G)$.

We need the additional condition that $c_1^G = c_2^G$ to show that $(\mathrm{resolve}(c_1, c_2), G)$ is the
canonical resolvent; there is no explicit requirement that the group of stable extensions
of $(c_1, G)$ and $(c_2, G)$ be a subgroup of $G$. But if $c_1^G = c_2^G$, the stable extensions must
agree with elements of $G$ on $c_1^G = c_2^G$, and hence must agree with elements of $G$ on $c_1$ and
$c_2$ themselves, and therefore on $\mathrm{resolve}(c_1, c_2)$ as well. Thus the canonical resolvent is
equivalent to $(\mathrm{resolve}(c_1, c_2), G)$.

6. Strength It is clear that the group of stable extensions can never be bigger than
$\langle G_1, G_2 \rangle$, since every permutation that agrees with $c_1$ on $G_1$ and with $c_2$ on $G_2$ is
contained in this group. But if $g = \prod g_i$ is an element of $\langle G_1, G_2 \rangle$, with $g_i \in G_1$ for $i$ odd
and $g_i \in G_2$ for $i$ even, then $c_1^g = c_1^{\prod_{i\ \mathrm{odd}} g_i} = c_1^{g'}$ for $g' = \prod_{i\ \mathrm{odd}} g_i \in G_1$. The $(c_2, G_2)$ case
is similar, so $(\mathrm{resolve}(c_1, c_2), \langle G_1, G_2 \rangle)$ is the canonical resolvent of $(c_1, G_1)$ and $(c_2, G_2)$.
□

Proposition 6.3 Let $p$ and $q$ be quantified clauses such that there is a term $t_p$ in $p$ and $\neg t_q$ in
$q$ where $t_p$ and $t_q$ have common instances. Suppose also that $(p_g, P)$ is an augmented clause
equivalent to $p$ and $(q_g, Q)$ is an augmented clause equivalent to $q$, where $p_g$ and $q_g$ resolve.
Then if no terms in $p$ and $q$ except for $t_p$ and $t_q$ have common instances, the result of
resolving $p$ and $q$ in the conventional lifted sense is equivalent to $\mathrm{resolve}((p_g, P), (q_g, Q))$.

Proof. The proof is already contained in the discussion surrounding the example in the
main text. The base instance of the augmented resolvent is clearly an instance of the
quantified resolution; for the group, we have already remarked that the group of stable
extensions of the two embeddings corresponds simply to any bindings for the variables in
the resolvents that can be extended to a permutation of all of the variables in question. This
means that the bindings must be consistent with regard to the values selected for shared
terms, and no two distinct quantified literals are mapped to identical ground atoms. The
latter condition follows from the assumption that the non-resolving literals have no common
ground instances. □

Proposition 6.4 There is an augmented resolution proof of polynomial size of the mutual 

unsatisfiability of (34) and (35). 

Proof. We begin by explaining how the proof goes generally, and only subsequently provide 
the details. From the fact that the first pigeon has to be in one of the n holes, we can 
conclude that one of the first two pigeons must be in one of the last n — 1 holes (since these 
first two pigeons can't both be in the first hole). Now one of the first three pigeons must 
be in one of the last n — 2 holes, and so on until we conclude that one of the first n pigeons 
must be in the last hole. Similarly, one of the first n pigeons must be in each hole, leaving 
no hole for the final pigeon. 

To formalize this, we will write $A_k$ for the fact that one of the first $k$ pigeons must be
in one of the last $n + 1 - k$ holes:

$$A_k = \bigvee_{\substack{1 \le i \le k \\ k \le j \le n}} p_{ij}$$

Our basic strategy for the proof will be to show that if we denote the original axioms (34)
and (35) by PHP:^17

1. PHP $\vdash A_1$,

2. PHP $\wedge A_k \vdash A_{k+1}$,

3. PHP $\wedge A_n \vdash \bot$, where $\bot$ denotes a contradiction.

In addition, since the same group $G$ appears throughout the original axiomatization, we
will drop it from the derivation, but will feel free to resolve against $c^g$ for any $g \in G$ and
derived conclusion $c$.

17. Our notation here is vaguely similar to that used by Krishnamurthy (1985), although both the problem
being solved and the techniques used are different.

For the first claim, note that $A_1$ is given by

$$\bigvee_{1 \le j \le n} p_{1j}$$

which is an instance of (35).

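For the second, we have $A_k$, which is

$$\bigvee_{\substack{1 \le i \le k \\ k \le j \le n}} p_{ij}$$

and we need to remove all of the variables $p_{ik}$ that refer to the $k$th hole. To do this, we
resolve the above clause with each of

$$\begin{array}{c} \neg p_{1k} \vee \neg p_{k+1,k} \\ \neg p_{2k} \vee \neg p_{k+1,k} \\ \vdots \\ \neg p_{kk} \vee \neg p_{k+1,k} \end{array}$$

to get

$$\neg p_{k+1,k} \vee \bigvee_{\substack{1 \le i \le k \\ k+1 \le j \le n}} p_{ij}$$

Now note that the only holes mentioned in the disjunction on the right of the above
expression are the $(k+1)$st and higher, so that we can apply the group $G$ to conclude

$$\neg p_{k+1,m} \vee \bigvee_{\substack{1 \le i \le k \\ k+1 \le j \le n}} p_{ij}$$

for any $1 \le m \le k$. Now if we resolve each of these with the instance of (35) given by

$$p_{k+1,1} \vee \cdots \vee p_{k+1,n}$$

we get

$$\bigvee_{k+1 \le j \le n} p_{k+1,j} \vee \bigvee_{\substack{1 \le i \le k \\ k+1 \le j \le n}} p_{ij}$$

which is to say

$$\bigvee_{\substack{1 \le i \le k+1 \\ k+1 \le j \le n}} p_{ij}$$
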
or $A_{k+1}$.

Finally, we need to derive a contradiction from $A_n$, which is to say

$$\bigvee_{1 \le i \le n} p_{in}$$

Resolving with each of

$$\begin{array}{c} \neg p_{1n} \vee \neg p_{n+1,n} \\ \vdots \\ \neg p_{nn} \vee \neg p_{n+1,n} \end{array}$$

now gives $\neg p_{n+1,n}$, and we can thus conclude $\neg p_{ij}$ for any $i$ and $j$ by acting with the group
$G$. Resolving into any instance of (35) now gives the desired contradiction. □
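
The derivation in the proof of Proposition 6.4 can be replayed mechanically for small $n$. The sketch below is an illustration under assumptions of our own: a literal is a triple `(sign, pigeon, hole)`, the only group elements used are hole transpositions, and the function names are hypothetical. Each call to `step` performs $k$ resolutions against exclusion axioms, $k$ applications of a hole swap, and $k$ resolutions against the axiom that pigeon $k+1$ is in some hole.

```python
def A(k, n):
    """A_k: one of the first k pigeons is in one of holes k..n."""
    return frozenset((True, i, j) for i in range(1, k + 1) for j in range(k, n + 1))

def resolve(c1, c2, var):
    """Resolve on variable var = (pigeon, hole); c1 has it positive, c2 negative."""
    pos, neg = (True,) + var, (False,) + var
    assert pos in c1 and neg in c2
    return (c1 - {pos}) | (c2 - {neg})

def swap_holes(clause, a, b):
    """Act on a clause with the group element exchanging holes a and b."""
    s = {a: b, b: a}
    return frozenset((sg, i, s.get(j, j)) for sg, i, j in clause)

def step(k, n):
    """Derive A_{k+1} from A_k (for k = n this yields the empty clause)."""
    c = A(k, n)
    for i in range(1, k + 1):          # remove every literal mentioning hole k
        excl = frozenset({(False, i, k), (False, k + 1, k)})
        c = resolve(c, excl, (i, k))
    # c is now  -p_{k+1,k}  v  (p_ij for i <= k, j >= k+1)
    result = frozenset((True, k + 1, j) for j in range(1, n + 1))
    for m in range(1, k + 1):          # use the instances with hole k sent to hole m
        result = resolve(result, swap_holes(c, k, m), (k + 1, m))
    return result

n = 4
chain_ok = all(step(k, n) == A(k + 1, n) for k in range(1, n))
contradiction = step(n, n)
```

The whole refutation uses a number of resolution steps that grows quadratically in $n$ in this sketch, in contrast to the exponential lower bound for ordinary resolution on these axioms (Haken, 1985).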

Lemma A.3 Assuming that we only branch on positive literals in unsatisfied clauses, let
$p_{jk}$ be any of the first $n - 2$ branch decisions in solving the pigeonhole problem. The set of
unit propagations that result from this branch decision is exactly the set $S_k = \{\neg p_{ik} \mid i \ne j\}$.

Proof. We prove this by induction on the number of branch decisions. For the base case,
we take $n > 3$ and consider the first branch decision $p_{jk}$. For each $\neg p_{ik} \in S_k$ there is
an instance of (35) of the form $\neg p_{jk} \vee \neg p_{ik}$ that causes the unit propagation $\neg p_{ik}$. No
other instances of (35) contain literals that refer to hole $k$, so (35) produces no further unit
propagations. Each instance of (34) has a total of $n$ literals with at most one literal that
refers to hole $k$. Because $n > 3$, each instance must have at least two unvalued literals and
therefore does not generate a unit propagation.

For the inductive case, we assume that Lemma A.3 holds for the first $m$ branches with
$m < n - 2$. Under this assumption, each branch decision $p_{jk}$ and its subsequent unit
propagations value exactly the variables involving hole $k$. We can therefore make the same
argument as we did in the base case. Let $p_{jk}$ be the $(m+1)$st branch decision. Clause (35)
produces exactly the set $S_k = \{\neg p_{ik} \mid i \ne j\}$ via unit propagation, and because $m + 1 \le n - 2$,
each instance of (34) has at least two unvalued literals and therefore does not generate any
unit propagations.

The key observation is that each branch decision and its subsequent unit propagations 
value all the variables (and only the variables) that refer to a particular hole. □ 
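
Lemma A.3 can be observed directly with a naive unit propagator. This is a minimal sketch, not the propagation machinery of the paper: the pigeonhole instance is written out in ground form for $n + 1$ pigeons and $n$ holes, a literal is a triple `(pigeon, hole, sign)`, and all function names are hypothetical.

```python
from itertools import combinations

def pigeonhole(n):
    """Ground clauses: every pigeon is in some hole; no hole holds two pigeons."""
    clauses = [[(i, j, True) for j in range(1, n + 1)] for i in range(1, n + 2)]
    for j in range(1, n + 1):
        for a, b in combinations(range(1, n + 2), 2):
            clauses.append([(a, j, False), (b, j, False)])
    return clauses

def unit_propagate(clauses, assignment):
    """Return the literals forced by unit propagation, extending `assignment`
    (a dict from (pigeon, hole) to bool) in place. Conflicts are not reported."""
    implied = []
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get((i, j)) == s for i, j, s in clause):
                continue                      # clause already satisfied
            unvalued = [l for l in clause if (l[0], l[1]) not in assignment]
            if len(unvalued) == 1:            # unit clause: force its last literal
                i, j, s = unvalued[0]
                assignment[(i, j)] = s
                implied.append((i, j, s))
                changed = True
    return implied

n = 4
clauses = pigeonhole(n)
assignment = {(2, 3): True}                   # first branch: pigeon 2 in hole 3
first = set(unit_propagate(clauses, assignment))
assignment[(1, 1)] = True                     # second branch: pigeon 1 in hole 1
second = set(unit_propagate(clauses, assignment))
```

With $n = 4$ the lemma covers the first $n - 2 = 2$ branch decisions, and in both cases the propagated literals are exactly the negations of the other pigeons in the chosen hole.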

Lemma A.4 Let $P = \{l_1, l_2, \ldots, l_m\}$ be a partial assignment obtained in solving the pigeonhole
problem, where every branch decision is on a positive literal in an unsatisfied clause.
For every branch decision $l_i$ in $P$, the subproblem below the open branch $\{l_1, l_2, \ldots, l_{i-1}, \neg l_i\}$
can be solved by unit propagation.

Proof. Assume we are about to begin exploring the subproblem below

$$P = \{l_1, l_2, \ldots, l_i, \neg p_{jk}\}$$

for some branch variable $p_{jk}$. The subproblem below $P = \{l_1, l_2, \ldots, l_i, p_{jk}\}$ has already
been explored, found to be unsatisfiable, and we've generated a nogood defining the reason
for the failure. This nogood will be an augmented clause of the form

$$(a_1 \vee \cdots \vee a_m \vee \neg p_{jk}, G) \tag{42}$$

The $a_i$ are unsatisfied under $P = \{l_1, l_2, \ldots, l_i\}$, and $G$ is the global symmetry group for
the problem.

But now recall that by virtue of Lemma A.3, each of our original branch decisions
together with its subsequent unit propagations valued all of the variables that referred to
one particular hole and no more. Consider the set of all holes referred to by the partial
assignment $\{l_1, l_2, \ldots, l_i\}$. We will call this set $H$.

When we branched on $p_{jk}$, pigeon $j$ had not yet been assigned to a hole. This follows
from our assumption that we branch only on positive literals in unsatisfied clauses. Thus
for all $h \in H$, $\neg p_{jh} \in P$; in other words, pigeon $j$ was excluded from all of the holes in $H$
prior to our decision to place it in hole $k$. The derived nogood (42) asserts that pigeon $j$
cannot go in hole $k$ either.

But as in the small example worked in the main text, the nogood (42) represents more
than a single clause. It represents the set of clauses that can be generated by applying
permutations in $G$ to $a_1 \vee \cdots \vee a_m \vee \neg p_{jk}$. If we apply a permutation that swaps hole $k$
with any hole $g \notin H$, the literals $a_1, \ldots, a_m$ will be unchanged and will remain unsatisfied
under $P = \{l_1, l_2, \ldots, l_i\}$. So the clause

$$a_1 \vee \cdots \vee a_m \vee \neg p_{jg} \tag{43}$$

is also an instance of (42) for any $g \notin H$, and (43) is also unit under the partial assignment
$P$. The nogood (42) thus generates a series of unit propagations indicating that pigeon $j$
cannot be in any hole not in $H$. Since the holes in $H$ are already known to be excluded,
there is no hole available for pigeon $j$. A contradiction occurs and the subproblem below
$P = \{l_1, l_2, \ldots, l_i, \neg p_{jk}\}$ is closed. □

Proposition 6.5 Any implementation of Procedure 2.8 that branches on positive literals in 
unsatisfied clauses on line 12 will produce a proof of polynomial size of the mutual unsatis- 
fiability of (34) and (35), independent of specific branching choices made. 
Proof. Note first that any RBL search tree has size polynomial in the number of branch 
decisions, since the number of variable assignments that can result from unit propagation 
is bounded by the number of variables. To show that the search tree has size polynomial 
in the number of pigeons n, it thus suffices to show that the number of branch decisions is 
polynomial in n. We will show that under the given branch heuristic, the number of branch 
decisions is n — 1 specifically. 

To do this, we first descend into the tree through branching and propagation until a 
contradiction is reached, and describe the partial assignment that is created and show how 
a contradiction is drawn. We then show the backtracking process, proving that the empty 
clause can be derived in a single backtrack. More specifically, we show that every open 
branch of the search tree can be closed through propagation alone. No further branch 
decisions are needed. 

Lemma A.3 deals with the first $n - 2$ branch decisions. What about the $(n-1)$st decision? If
this branch decision is $p_{jk}$, we again generate the set of unit propagations $S_k = \{\neg p_{ik} \mid i \ne j\}$.
This time we will generate some additional unit propagations. Since we have assigned $n - 1$
of the pigeons each to a unique hole, there is only one empty hole remaining. If this is hole
$h$, the two remaining pigeons (say pigeons $a$ and $b$) are both forced into hole $h$, while only
one can occupy it. This leads to the expected contradiction. But now Lemma A.4 shows
that no further branches are necessary, so that the total number of branches is $n - 1$, and
the RBL search tree is polynomially sized. □

Proposition 6.6 There is an augmented resolution proof of polynomial size of the unsatis- 
fiability of (36)-(40). 

Proof. The proof proceeds similarly to the proof of Proposition 6.4, although the details 
are far more intricate. As before, we will work with ground axioms only and will suppress 

the augmentation by the global symmetry group. 

The analog to $p_{ij}$ is that the $i$th node of the clique gets the $j$th color, or

$$(q_{i1} \wedge c_{1j}) \vee \cdots \vee (q_{im} \wedge c_{mj})$$

which we will manipulate in this form although it's clearly not CNF.

Now $A_1$, the statement that the first pigeon is in some hole, or that the first node of
the clique gets some color, is

$$[q_{11} \wedge (c_{11} \vee \cdots \vee c_{1n})] \vee \cdots \vee [q_{1m} \wedge (c_{m1} \vee \cdots \vee c_{mn})]$$

The expression for $A_k$, which was

$$\bigvee_{\substack{1 \le i \le k \\ k \le j \le n}} p_{ij} \tag{44}$$

similarly becomes

$$[(q_{11} \vee \cdots \vee q_{k1}) \wedge (c_{1k} \vee \cdots \vee c_{1n})] \vee \cdots \vee [(q_{1m} \vee \cdots \vee q_{km}) \wedge (c_{mk} \vee \cdots \vee c_{mn})] \tag{45}$$

saying that for some $i$ and $j$ as in (44), there is an index $h$ such that $q_{ih} \wedge c_{hj}$; $h$ is the index
of the graph node to which a clique element of a suitable color gets mapped.

In order to work with the expressions (45), we do need to convert them into CNF.
Distributing the $\wedge$ and $\vee$ in (45) will produce a list of conjuncts, each of the form

$$B_1 \vee \cdots \vee B_m \tag{46}$$

where each $B_i$ is of the form either $q_{1i} \vee \cdots \vee q_{ki}$ or $c_{ik} \vee \cdots \vee c_{in}$. There are $2^m$ possible
expressions of the form (46).

Each of these $2^m$ expressions, however, is an instance of the result of acting with the
global group $G$ on one of the following:

$$\begin{array}{l}
(c_{1k} \vee \cdots \vee c_{1n}) \vee (c_{2k} \vee \cdots \vee c_{2n}) \vee \cdots \vee (c_{mk} \vee \cdots \vee c_{mn}) \\
(q_{11} \vee \cdots \vee q_{k1}) \vee (c_{2k} \vee \cdots \vee c_{2n}) \vee \cdots \vee (c_{mk} \vee \cdots \vee c_{mn}) \\
\qquad \vdots \\
(q_{11} \vee \cdots \vee q_{k1}) \vee (q_{12} \vee \cdots \vee q_{k2}) \vee \cdots \vee (q_{1m} \vee \cdots \vee q_{km})
\end{array} \tag{47}$$

We will view these as all instances of a general construct indexed by $h$, with $h$ giving the
number of initial clauses based on the $q$'s instead of the $c$'s. So the first row corresponds to
$h = 0$, the second to $h = 1$ and so on, with the last row corresponding to $h = m$. It follows
from this that $A_k$ is effectively

$$\bigwedge_{0 \le h \le m} \left( \bigvee_{\substack{1 \le i \le k \\ 1 \le j \le h}} q_{ij} \vee \bigvee_{\substack{h+1 \le i \le m \\ k \le j \le n}} c_{ij} \right) \tag{48}$$



It is important to realize that we haven't actually started to prove anything yet; we're
just setting up the machinery needed to duplicate the proof of Proposition 6.4. The only
remaining piece is the analog in this framework of the axiom $\neg p_{ij} \vee \neg p_{kj}$, saying that each
hole can only contain one pigeon. That is

$$\neg q_{ih} \vee \neg c_{hj} \vee \neg q_{kg} \vee \neg c_{gj} \tag{49}$$

saying that if node $i$ of the clique is mapped to node $h$ of the graph, and $k$ is mapped to $g$,
then $g$ and $h$ cannot both get the same color.

If $g \ne h$, we can derive (49) by resolving (36) and (40). If $g = h$, then (49) becomes
$\neg q_{ih} \vee \neg c_{hj} \vee \neg q_{kh}$ and is clearly a weakening of (39). Thus $\neg p_{ij} \vee \neg p_{kj}$ becomes the pair of
clauses

$$\begin{array}{l}
\neg q_{ih} \vee \neg c_{hj} \vee \neg q_{kg} \vee \neg c_{gj} \\
\neg q_{ih} \vee \neg c_{hj} \vee \neg q_{kh}
\end{array}$$

both of which can be derived in polynomial time and are, as usual, acted on by the group
$G$. We are finally ready to proceed with the main proof.

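For the base step, we must derive $A_1$, or the conjunction of

$$\bigvee_{\substack{1 \le i \le 1 \\ 1 \le j \le h}} q_{ij} \vee \bigvee_{\substack{h+1 \le i \le m \\ 1 \le j \le n}} c_{ij}$$

which is to say

$$\begin{array}{l}
(c_{11} \vee \cdots \vee c_{1n}) \vee (c_{21} \vee \cdots \vee c_{2n}) \vee \cdots \vee (c_{m1} \vee \cdots \vee c_{mn}) \\
q_{11} \vee (c_{21} \vee \cdots \vee c_{2n}) \vee \cdots \vee (c_{m1} \vee \cdots \vee c_{mn}) \\
\qquad \vdots \\
q_{11} \vee q_{12} \vee \cdots \vee q_{1m}
\end{array}$$

Except for the last row, each of these is obviously a weakening of (37) saying that every
node in the graph gets a color. The final row is equivalent to (38) saying that each element
of the clique gets a node in the graph.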

For the inductive step, we must show that $A_k \vdash A_{k+1}$. Some simplifying notation will
help, so we introduce

$$C_{ij..k} = (c_{ij} \vee \cdots \vee c_{ik})$$

and

$$Q_{i..jk} = (q_{ik} \vee \cdots \vee q_{jk})$$

Now $A_k$ as in (47) is actually

$$\begin{array}{l}
C_{1k..n} \vee C_{2k..n} \vee \cdots \vee C_{mk..n} \\
Q_{1..k1} \vee C_{2k..n} \vee \cdots \vee C_{mk..n} \\
\qquad \vdots \\
Q_{1..k1} \vee Q_{1..k2} \vee \cdots \vee Q_{1..km}
\end{array} \tag{50}$$

Following the pigeonhole proof, we need to reduce the number of holes (i.e., colors) by one
and increase the number of pigeons (i.e., clique elements) by one. For the first step, we need
to resolve away the appearances of $c_{ik}$ from (50). We claim that from (50) it is possible to
derive

$$\begin{array}{l}
C_{1,k+1..n} \vee C_{2,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk} \\
Q_{1..k1} \vee C_{2,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk} \\
\qquad \vdots \\
Q_{1..k1} \vee Q_{1..k2} \vee \cdots \vee Q_{1..km} \vee \neg q_{k+1,m} \vee \neg c_{mk}
\end{array} \tag{51}$$

We show this by working from the bottom of the arrays ($h = m$ in the description (48)) to
the top ($h = 0$). For the last row, the expression in (51) is clearly a weakening of the final
row in (50).

Suppose that we have done $h = i$ and are now considering $h = i - 1$ corresponding to
the derivation of the disjunction

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk}$$

from

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{ik..n} \vee \cdots \vee C_{mk..n} \tag{52}$$

Now recall that $C_{jk..n}$ is

$$c_{jk} \vee \cdots \vee c_{jn}$$

and we want to remove the $c_{jk}$ term by resolving with (49), adding literals of the form
$\neg q_{k+1,m} \vee \neg c_{mk}$. The instance of (49) that we use is

$$\neg q_{k+1,m} \vee \neg c_{mk} \vee \neg q_{lj} \vee \neg c_{jk}$$

and the resolvent is

$$C_{j,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk} \vee \neg q_{lj}$$

The result of resolving into (52) is therefore

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{ik..n} \vee \cdots \vee C_{j-1,k..n} \vee C_{j,k+1..n} \vee C_{j+1,k..n} \vee \cdots \vee C_{mk..n} \vee \neg q_{k+1,m} \vee \neg c_{mk} \vee \neg q_{lj} \tag{53}$$

for any $l$ and $i \le j \le m$.

Now we have already done $h = i$, which means that we have derived

$$Q_{1..k1} \vee \cdots \vee Q_{1..ki} \vee C_{i+1,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk}$$

Operating on this with the usual group allows us to exchange the $q$ and $c$ variables arbitrarily,
so it matters not that the first $i$ terms involve the $q$ variables and the last $m - i$ involve the
$c$ variables, but only that the number of terms involving $q$ and $c$ variables are $i$ and $m - i$
respectively. Thus we have also derived

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee Q_{1..kj} \vee C_{i,k+1..n} \vee \cdots \vee C_{j-1,k+1..n} \vee C_{j+1,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk} \tag{54}$$

(where we have essentially swapped node $i$ and node $j$ in (53)). By taking $l = 1, \ldots, k$ in
(53), we can resolve (54) with (53) to eliminate both the trailing $\neg q_{lj}$ in (53) and the $Q_{1..kj}$
term above. Since the literals in $C_{h,k+1..n}$ are a subset of those in $C_{hk..n}$, we get

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{ik..n} \vee \cdots \vee C_{j-1,k..n} \vee C_{j,k+1..n} \vee C_{j+1,k..n} \vee \cdots \vee C_{mk..n} \vee \neg q_{k+1,m} \vee \neg c_{mk}$$

We can continue in this fashion, gradually raising the second index of each $C$ term from $k$
to $k + 1$, to finally obtain

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk} \tag{55}$$

as in (51).

The hard part is now done; we have exploited the symmetry over the nodes and it
remains to use the symmetry over the colors. In the derivation of (55), the color $k$ in the
final $\neg c_{mk}$ is obviously irrelevant provided that it is chosen from the set $1, \ldots, k$; higher
numbered colors (but only higher numbered colors) already appear in the $C_{j,k+1..n}$. So we
have actually derived

$$\begin{array}{l}
Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{mk} \\
\qquad \vdots \\
Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{m2} \\
Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee \neg c_{m1}
\end{array}$$

and when we resolve all of these with the domain axiom (37)

$$c_{m1} \vee c_{m2} \vee \cdots \vee c_{mn}$$

we get

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee c_{m,k+1} \vee \cdots \vee c_{mn}$$

which is to say

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m} \vee C_{m,k+1..n}$$

or

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m}$$

Now the $m$ subscript in the final $q$ variable is also of no import, provided that it remains
at least $i$. We can therefore resolve

$$\begin{array}{l}
Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,i} \\
\qquad \vdots \\
Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee \neg q_{k+1,m}
\end{array}$$

with the domain axiom (38)

$$q_{k+1,1} \vee \cdots \vee q_{k+1,m}$$

to get

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n} \vee q_{k+1,1} \vee \cdots \vee q_{k+1,i-1}$$

which is to say

$$Q_{1..k+1,1} \vee \cdots \vee Q_{1..k+1,i-1} \vee C_{i,k+1..n} \vee \cdots \vee C_{m,k+1..n}$$

This is $A_{k+1}$ as desired.

It remains to show that we can derive a contradiction from $A_n$. If we continue with the
above procedure in an attempt to "derive" $A_{n+1}$, when we derive an instance of (55), the
$C$ terms will simply vanish because $k + 1 > n$. We end up concluding

$$Q_{1..k1} \vee \cdots \vee Q_{1..k,i-1} \vee \neg q_{k+1,m} \vee \neg c_{mk}$$

and the $i = 1$ instance is simply

$$\neg q_{k+1,m} \vee \neg c_{mk}$$

All of the indices here are subject to the usual symmetry, so we know

$$\neg q_{ij} \vee \neg c_{jl}$$

which we resolve with $c_{j1} \vee \cdots \vee c_{jn}$ to get $\neg q_{ij}$. We can resolve instances of this with
$q_{i1} \vee \cdots \vee q_{im}$ to finally get the desired contradiction. □

Lemma 6.7 Let $C$ be a theory consisting entirely of parity clauses. Then determining
whether or not $C$ is satisfiable is in P.

Proof. The proof is essentially a Gaussian reduction argument, and proceeds by induction
on $n$, the number of variables in $C$. If $n = 0$, the result is immediate. So suppose that $C$
contains $n + 1$ variables, and let one clause containing $x_1$ be

$$x_1 + \sum_{x \in S} x = k$$

where $k = 0$ or $k = 1$. This is obviously equivalent to

$$x_1 = \sum_{x \in S} x + k$$

which we can now use to eliminate $x_1$ from every other axiom in $C$ in which it appears.
Since the resulting theory can be tested for satisfiability in polynomial time, the result
follows. □
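
The elimination argument of Lemma 6.7 translates directly into a short procedure. The sketch below is an illustration only: a parity clause is encoded as a pair `(variables, k)` meaning that the sum of the (distinct) variables equals $k$ modulo 2, and the name `parity_satisfiable` is hypothetical.

```python
def parity_satisfiable(clauses):
    """Decide satisfiability of parity clauses by Gaussian elimination over GF(2).

    clauses: iterable of (variables, k), read as sum(variables) = k (mod 2).
    Each clause is assumed to list distinct variables."""
    rows = [(frozenset(vs), k % 2) for vs, k in clauses]
    while rows:
        vs, k = rows.pop()
        if not vs:
            if k == 1:
                return False          # the clause reads 0 = 1
            continue
        x = next(iter(vs))            # pivot variable, defined by this clause
        # Adding the pivot clause to another clause (symmetric difference of the
        # variable sets, XOR of the right-hand sides) eliminates x from it.
        rows = [(ws ^ vs, j ^ k) if x in ws else (ws, j) for ws, j in rows]
    return True

sat = parity_satisfiable([({'a', 'b'}, 1), ({'b', 'c'}, 1), ({'a', 'c'}, 0)])
unsat = parity_satisfiable([({'a', 'b'}, 1), ({'b', 'c'}, 1), ({'a', 'c'}, 1)])
```

Each pass removes one clause and one variable from the remaining system, so the number of passes is bounded by the number of clauses and the total work is polynomial.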






Lemma A.5 Suppose that we have two axioms given by

$$x_1 + \sum_{x \in S} x + \sum_{x \in T_1} x = k_1 \tag{56}$$

and

$$x_1 + \sum_{x \in S} x + \sum_{x \in T_2} x = k_2 \tag{57}$$

where the sets $S$, $T_1$ and $T_2$ are all disjoint. Then it follows that

$$\sum_{x \in T_1} x + \sum_{x \in T_2} x = k_1 + k_2 \tag{58}$$

and furthermore, any proof system that can derive this in polynomial time can also determine
the satisfiability of sets of parity clauses in polynomial time.

Proof. Adding (56) and (57) produces (58). That this is sufficient to solve sets of parity
clauses in polynomial time is shown in the proof of Lemma 6.7. □

Lemma 6.9 $F_S \le W_n$.

Proof. $F_S$ is closed under inversion, since every element in $F_S$ is its own inverse. To see
that it is closed under composition as well, suppose that $f_1$ flips the variables in a set $S_1$ and
$f_2$ flips the variables in a set $S_2$. Then $f_1 f_2$ flips the variables in $S_{12} = S_1 \cup S_2 - (S_1 \cap S_2)$.
But now

$$\begin{aligned}
|S_{12}| &= |S_1 \cup S_2 - (S_1 \cap S_2)| \\
&= |S_1 \cup S_2| - |S_1 \cap S_2| \\
&= |S_1| + |S_2| - |S_1 \cap S_2| - |S_1 \cap S_2| \\
&= |S_1| + |S_2| - 2 \cdot |S_1 \cap S_2|
\end{aligned}$$

and is therefore even, so $f_1 f_2 \in F_S$. □
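
The counting step in Lemma 6.9 can be confirmed exhaustively on a small set. In this sketch an element of $F_S$ is identified with the even-sized subset of $S$ that it flips, and composition of two flips corresponds to the symmetric difference of their subsets; the five-element set and the names are assumptions made for illustration.

```python
from itertools import combinations

def even_subsets(S):
    """All subsets of S of even size, each standing for one element of F_S."""
    items = sorted(S)
    return [frozenset(c) for r in range(0, len(items) + 1, 2)
            for c in combinations(items, r)]

S = {'v', 'w', 'x', 'y', 'z'}
F = even_subsets(S)
# f1 f2 flips exactly the variables flipped by one of f1, f2 but not both.
closed = all((s1 ^ s2) in F for s1 in F for s2 in F)
```

For a five-element set there are 16 such subsets, half of all $2^5$ subsets, and `closed` confirms that every pairwise composition stays inside the family.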

Lemma 6.10 Let $S = \{x_1, \ldots, x_k\}$ be a subset of a set of $n$ variables. Then the parity
clause

$$\sum_{i=1}^{k} x_i = 1 \tag{59}$$

is equivalent to the augmented clause

$$(x_1 \vee \cdots \vee x_k, F_S) \tag{60}$$

The parity clause

$$\sum_{i=1}^{k} x_i = 0$$

is equivalent to the augmented clause

$$(\neg x_1 \vee x_2 \vee \cdots \vee x_k, F_S)$$

Proof. To see that (59) implies (60), note that (59) certainly implies $x_1 \vee \cdots \vee x_k$. But
the result of operating on the disjunction with any element of $F_S$ flips an even number of
elements in it, so (60) follows.

For the converse, suppose that (59) fails, so that an even number of the $x_i$'s are true.
The disjunction that flips exactly those $x_i$'s that are true will obviously have no satisfied
literals, but will have flipped an even number of elements of $S$ so that some instance of the
augmented clause (60) is unsatisfied.

The second equivalence clearly follows from the first; replace $x_1$ with $\neg x_1$. □
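
The first equivalence of Lemma 6.10 can be verified by brute force for small $k$. The sketch below assumes an encoding in which a clause over $x_1, \ldots, x_k$ is a tuple of signs (`True` for a positive literal) and enumerates the instances of (60) as the clauses with an even number of flipped literals; the function names are hypothetical.

```python
from itertools import combinations, product

def instances(k):
    """Instances of (x_1 v ... v x_k, F_S): flip an even number of literals."""
    out = []
    for r in range(0, k + 1, 2):
        for flipped in combinations(range(k), r):
            out.append(tuple(i not in flipped for i in range(k)))
    return out

def satisfies_all(assignment, clauses):
    """A literal with sign s on variable i is satisfied when assignment[i] == s."""
    return all(any(a == s for a, s in zip(assignment, clause)) for clause in clauses)

def equivalent(k):
    """Check that the instances hold exactly on assignments of odd parity."""
    clauses = instances(k)
    return all(satisfies_all(a, clauses) == (sum(a) % 2 == 1)
               for a in product([False, True], repeat=k))

checks = [equivalent(k) for k in range(1, 7)]
```

An assignment falsifies an instance only when its set of true variables equals the flipped set, which has even size; this is the converse direction of the proof in executable form.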

Proposition 6.11 Let C be a theory consisting entirely of parity clauses. Then determining 

whether or not C is satisfiable is in P for augmented resolution. 

Proof. We need to show that the conditions of Lemma A.5 are met. We can assume
without loss of generality that $k_1 = 1$ and $k_2 = 0$ in the conditions of the lemma; other
cases involve simply flipping the sign of one of the variables involved.

In light of Lemma 6.10, we have the two augmented axioms

$$\left( x_1 \vee \bigvee_{x \in S} x \vee \bigvee_{x \in T_1} x,\; F_{x_1 \cup S \cup T_1} \right)$$

and

$$\left( \neg x_1 \vee \bigvee_{x \in S} x \vee \bigvee_{x \in T_2} x,\; F_{x_1 \cup S \cup T_2} \right)$$

where $S$, $T_1$, and $T_2$ are all disjoint. The clause obtained in the resolution is clearly

$$\bigvee_{x \in S \cup T_1 \cup T_2} x$$

but what is the group involved?

The elements of the group are the stable extensions of group elements from $F_{x_1 \cup S \cup T_1}$
and $F_{x_1 \cup S \cup T_2}$; in other words, any permutation that leaves the variables unchanged and
simultaneously flips an even number of elements of $x_1 \cup S \cup T_1$ and of $x_1 \cup S \cup T_2$. We
claim that these are exactly those elements that flip any subset of $S$ and an even number
of elements of $T_1 \cup T_2$.

We first show that any stable extension $g$ flips an even number of elements of $T_1 \cup T_2$
(along with some arbitrary subset of $S$). If $g$ flips an odd number of elements of $T_1 \cup T_2$,
then it must flip an odd number of elements of one (say $T_1$) and an even number of elements
of the other. Now if the parity of the number of flipped elements of $x_1 \cup S$ is even, the total
number flipped in $x_1 \cup S \cup T_1$ will be odd, so that $g$ does not match an element of $F_{x_1 \cup S \cup T_1}$
and is therefore not an extension. If the parity of the number of flipped elements of $x_1 \cup S$
is odd, the number flipped in $x_1 \cup S \cup T_2$ will be odd.

To see that any $g$ flipping an even number of elements of $T_1 \cup T_2$ corresponds to a stable
extension, we note simply that by flipping $x_1$ or not, we can ensure that $g$ flips an even
number of elements in each relevant subset. Since $g$ flips an even number of elements in
$T_1 \cup T_2$, it flips subsets of $T_1$ and of $T_2$ that have the same parity. So if it flips an even
number of elements of $S \cup T_1$, it also flips an even number of elements of $S \cup T_2$ and we
leave $x_1$ unflipped. If $g$ flips an odd number of elements of $S \cup T_1$ and of $S \cup T_2$, we flip $x_1$.
Either way, there are corresponding elements of $F_{x_1 \cup S \cup T_1}$ and $F_{x_1 \cup S \cup T_2}$.
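
The parity bookkeeping in the two preceding paragraphs can be checked exhaustively on small sets. The sketch below deliberately simplifies the notion of a stable extension to the condition used in the argument: a set of flipped variables in $S \cup T_1 \cup T_2$ qualifies if some choice of flipping $x_1$ or not makes the number of flips even on both $x_1 \cup S \cup T_1$ and $x_1 \cup S \cup T_2$. The particular sets and names are assumptions for illustration.

```python
from itertools import combinations

def subsets(xs):
    xs = sorted(xs)
    for r in range(len(xs) + 1):
        for c in combinations(xs, r):
            yield frozenset(c)

def is_stable_extension(flipped, S, T1, T2):
    """True if flipping x1 (b = 1) or not (b = 0) gives an even number of
    flips on both {x1} | S | T1 and {x1} | S | T2."""
    return any((b + len(flipped & (S | T1))) % 2 == 0 and
               (b + len(flipped & (S | T2))) % 2 == 0
               for b in (0, 1))

S, T1, T2 = frozenset('st'), frozenset('uv'), frozenset('w')
claim_holds = all(
    is_stable_extension(f, S, T1, T2) == (len(f & (T1 | T2)) % 2 == 0)
    for f in subsets(S | T1 | T2)
)
```

The check confirms the claim for these sets: the qualifying flips are exactly those that flip an arbitrary subset of $S$ together with an even number of elements of $T_1 \cup T_2$.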

Now suppose that we denote by $K_S$ the group that flips an arbitrary subset of $S$. We
have shown that the result of the resolution is

$$\left( \bigvee_{x \in S \cup T_1 \cup T_2} x,\; K_S \times F_{T_1 \cup T_2} \right) \tag{61}$$

Note that the resolution step itself is polytime, since we have given the result explicitly
in (61).^18

Next, we claim that (61) implies

$$\left( \bigvee_{x \in T_1 \cup T_2} x,\; K_S \times F_{T_1 \cup T_2} \right) \tag{62}$$

where we have removed the elements of $S$ from the disjunction.

We prove this by induction on the size of $S$. If $S = \emptyset$, the result is immediate. Otherwise,
if $a \in S$ for some specific $a$, two instances of (61) are

$$\left( a \vee \bigvee_{x \in S - \{a\} \cup T_1 \cup T_2} x,\; K_S \times F_{T_1 \cup T_2} \right)$$

and

$$\left( \neg a \vee \bigvee_{x \in S - \{a\} \cup T_1 \cup T_2} x,\; K_S \times F_{T_1 \cup T_2} \right)$$

which we can resolve using the stability property (5) of Definition 5.4 to conclude

$$\left( \bigvee_{x \in S - \{a\} \cup T_1 \cup T_2} x,\; K_S \times F_{T_1 \cup T_2} \right)$$

so that (62) now follows by the inductive hypothesis.

At this point, however, note that the variables in $S$ do not appear in the clause in (62),
so that we can drop $K_S$ from the group in the conclusion without affecting it in any way.
Thus we have concluded

$$\left( \bigvee_{x \in T_1 \cup T_2} x,\; F_{T_1 \cup T_2} \right) \tag{63}$$

Applying Lemma 6.10 once again, we see that (63) is equivalent to

$$\sum_{x \in T_1} x + \sum_{x \in T_2} x = 1$$

as needed by Lemma A.5. The proof is complete. □

18. In general, augmented resolution is not known to be polynomial in the number of generators of the
groups in question. But it is polynomial for the groups of restricted form being considered here.






Dixon, Ginsberg, Luks & Parkes 



References 

Babai, L. (1986). On the length of subgroup chains in the symmetric group. Comm. Algebra, 
14{9), 1729-1736. 

Babai, L., Luks, E., & Seress, A. (1997). Fast management of permutation groups I. SIAM 

J. Computing, 26, 1310-1342. 

Barth, P. (1995). A Davis-Putnam based enumeration algorithm for linear pseudo- 
boolean optimization. Tech. rep. MPI-I-95- 2-003, Max Planck Institut fiir Informatik, 
Saarbriicken, Germany. 

Barth, P. (1996). Logic-Based 0-1 Constraint Programming, Vol. 5 of Operations Re- 
search/Computer Science Interfaces Series. Kluwer. 

Baumgartncr, P., k Massacci, F. (2000). The Taming of the (X)OR. In Lloyd, J., Dahl, 
v., Furbach, U., Kerber, M., Lau, K.-K., Palamidcssi, C, Pereira, L. M., Sagiv, Y., 
& Stuckey, P. J. (Eds.), Computational Logic - CL 2000, Vol. 1861, pp. 508-522. 
Springer. 

Bayardo, R. J., k, Miranker, D. P. (1996). A complexity analysis of space-bounded learning 
algorithms for the constraint satisfaction problem. In Proceedings of the Thirteenth 

National Conference on Artificial Intelligence, pp. 298-304. 

Bayardo, R. J., & Schrag, R. C. (1997). Using CSP look-back techniques to solve real-world 
SAT instances. In Proceedings of the Fourteenth National Conference on Artificial 
Intelligence, pp. 203-208. 

Chandru, V., & Hooker, J. N. (1999). Optimization Methods for Logical Inference.
Wiley-Interscience.

Cook, W., Coullard, C., & Turan, G. (1987). On the complexity of cutting-plane proofs.
Discrete Applied Mathematics, 18, 25-38.

Crawford, J. M., Ginsberg, M. L., Luks, E., & Roy, A. (1996). Symmetry breaking predicates
for search problems. In Proceedings of the Fifth International Conference on Principles
of Knowledge Representation and Reasoning, Boston, MA.

Dixon, H. E., & Ginsberg, M. L. (2000). Combining satisfiability techniques from AI and
OR. Knowledge Engrg. Rev., 15, 31-45.

Dixon, H. E., Ginsberg, M. L., Hofer, D., Luks, E. M., & Parkes, A. J. (2004a). Generalizing
Boolean satisfiability III: Implementation. Tech. rep., On Time Systems, Inc., Eugene,
Oregon.

Dixon, H. E., Ginsberg, M. L., & Parkes, A. J. (2004b). Generalizing Boolean satisfiability
I: Background and survey of existing work. Journal of Artificial Intelligence Research,
21, 193-243.

Ginsberg, M. L. (1993). Dynamic backtracking. Journal of Artificial Intelligence Research,
1, 25-46.

Ginsberg, M. L., & Parkes, A. J. (2000). Search, subsearch and QPROP. In Proceedings of
the Seventh International Conference on Principles of Knowledge Representation and
Reasoning, Breckenridge, Colorado.






Guignard, M., & Spielberg, K. (1981). Logical reduction methods in zero-one programming. 
Operations Research, 29. 

Haken, A. (1985). The intractability of resolution. Theoretical Computer Science, 39,
297-308.

Harrison, M. A. (1989). Introduction to Switching and Automata Theory. McGraw-Hill. 

Hooker, J. N. (1988). Generalized resolution and cutting planes. Annals of Operations 
Research, 12, 217-239. 

Jerrum, M. (1986). A compact representation for permutation groups. J. Algorithms, 7, 
60-78. 

Knuth, D. E. (1991). Notes on efficient representation of permutation groups.
Combinatorica, 11, 57-68.

Krishnamurthy, B. (1985). Short proofs for tricky formulas. Acta Informatica, 22(3),
253-275.

Li, C. M. (2000). Integrating equivalency reasoning into Davis-Putnam procedure. In
Proceedings of the Seventeenth National Conference on Artificial Intelligence,
pp. 291-296.

Luks, E. M. (1993). Permutation Groups and Polynomial-Time Computation, Vol. 11 of
DIMACS Series in Discrete Mathematics and Theoretical Computer Science,
pp. 139-175. Amer. Math. Soc.

McIver, A., & Neumann, P. (1987). Enumerating finite groups. Quart. J. Math., 38(2),
473-488.

Moskewicz, M., Madigan, C., Zhao, Y., Zhang, L., & Malik, S. (2001). Chaff: Engineering 
an efficient SAT solver. In 39th Design Automation Conference. 

Nemhauser, G., & Wolsey, L. (1988). Integer and Combinatorial Optimization. Wiley, New
York.

Pudlak, P. (1997). Lower bounds for resolution and cutting planes proofs and monotone
computations. J. Symbolic Logic, 62(3), 981-998.

Pyber, L. (1993). Enumerating finite groups of given order. Ann. Math., 137, 203-220. 

Rotman, J. J. (1994). An Introduction to the Theory of Groups. Springer. 

Savelsbergh, M. W. P. (1994). Preprocessing and probing for mixed integer programming 
problems. ORSA Journal on Computing, 6, 445-454. 

Seress, A. (2003). Permutation Group Algorithms, Vol. 152 of Cambridge Tracts in
Mathematics. Cambridge University Press, Cambridge, UK.

Szeider, S. (2003). The complexity of resolution with generalized symmetry rules. In Alt, 
H., & Habib, M. (Eds.), Proceedings of STACS03, volume 2607 of Springer Lecture 
Notes in Computer Science, pp. 475-486. 

Tseitin, G. (1970). On the complexity of derivation in propositional calculus. In Slisenko, 
A. (Ed.), Studies in Constructive Mathematics and Mathematical Logic, Part 2, pp. 
466-483. Consultants Bureau. 






Zhang, H., & Stickel, M. E. (2000). Implementing the Davis-Putnam method. Journal
of Automated Reasoning, 24(1/2), 277-296.






